Programmers
Remote
Pay negotiable
Data Science. Code for a link parser. The language does not matter.

I need a parser that collects the photo links from a VK post. For example, I paste in the link to a post, https://vk.com/wall-58509583_1347991, and get links like these back:

https://vk.com/wall-58509583_1347991?z=photo-58509583_458739935%2Fwall-58509583_1347991
https://vk.com/wall-58509583_1347991?z=photo-58509583_458739936%2Fwall-58509583_1347991
https://vk.com/wall-58509583_1347991?z=photo-58509583_458739937%2Fwall-58509583_1347991
https://vk.com/wall-58509583_1347991?z=photo-58509583_458739938%2Fwall-58509583_1347991
https://vk.com/wall-58509583_1347991?z=photo-58509583_458739939%2Fwall-58509583_1347991

The "GigaChat" neural network suggested the Python code below, but it did not work no matter how many times I corrected it.

    import requests
    from lxml import html

    def parse_post_links(post_url, output_file='photo_links.txt'):
        # Create a session
        session = requests.Session()
        # Send a request for the page content
        response = session.get(post_url)
        # Parse the HTML document
        tree = html.fromstring(response.content)
        # Extract all links to photos
        photo_links = tree.xpath('//div[@class="page_block"]//a[starts-with(@href, "/wall-")]')
        # Save the links to a file
        with open(output_file, 'w', encoding='utf-8') as f:
            for link in photo_links:
                if '/photo-' in link.attrib['href']:
                    full_link = f'https://vk.com{link.attrib["href"]}\n'
                    f.write(full_link)

    if __name__ == "__main__":
        post_url = "https://vk.com/wall-58509583_1347991"
        parse_post_links(post_url)
        print("Photo links saved to 'photo_links.txt'.")

The file is saved, but it is empty.
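One possible direction, offered only as a rough sketch: instead of relying on a specific XPath, scan the page source for photo identifiers and build the links in the format shown in the examples. This rests on an unverified assumption that the HTML actually contains identifiers of the form `photo-58509583_458739935`. A plain `requests.get` may receive a login or script-driven page that has none, which could be one reason the file came out empty. Another possible reason is that `@class="page_block"` only matches elements whose class attribute is exactly that string. The function name `build_photo_links` and the sample HTML are hypothetical and are not taken from VK.

```python
import re
from urllib.parse import urlsplit

# Assumption: photos appear in the page source as "photo<owner_id>_<photo_id>".
PHOTO_ID_RE = re.compile(r"photo(-?\d+_\d+)")


def build_photo_links(post_url, page_html):
    """Return viewer-style photo links for a post, in order of first appearance."""
    # The path of the post URL, e.g. "wall-58509583_1347991".
    wall_id = urlsplit(post_url).path.strip("/")
    links = []
    seen = set()
    for photo_id in PHOTO_ID_RE.findall(page_html):
        if photo_id in seen:
            continue  # the same photo is often referenced more than once
        seen.add(photo_id)
        links.append(f"https://vk.com/{wall_id}?z=photo{photo_id}%2F{wall_id}")
    return links


if __name__ == "__main__":
    # Made-up HTML fragment for illustration; real VK markup may differ.
    sample_html = (
        '<a href="/photo-58509583_458739935"></a>'
        '<a href="/photo-58509583_458739936"></a>'
        '<a href="/photo-58509583_458739935"></a>'
    )
    for link in build_photo_links("https://vk.com/wall-58509583_1347991", sample_html):
        print(link)
```

To try it on a real post, the HTML would have to be fetched first (for example with `requests`, possibly with a browser-like `User-Agent` header) and passed in as `page_html`. The pattern may also pick up photos that are not attachments of the post, such as avatars, so the result may need filtering.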
2024-10-26