For myself. Help preparing a Python project for the course Algorithms for Massive Data. Project requirements (choose one of three topics):

Project 1: Finding similar items. The task is to implement a detector of pairs of similar items, analyzing one of the two datasets described below.
Bird species: The «250 Bird Species» dataset is published on Kaggle. The detector must consider the images in the dataset and output the pairs inferred as similar.
StackSample: The «StackSample» dataset is published on Kaggle and released under the CC-BY-SA 3.0 license, with attribution required. The detector must consider the Body column of the Questions.csv file and output the pairs inferred as similar.

Project 2: Market-basket analysis. The task is to implement a system finding frequent itemsets (aka market-basket analysis), analyzing one of the two datasets described below.
IMDB: The «IMDB» dataset is published on Kaggle, under IMDb non-commercial licensing. The analysis must be done considering movies as baskets and actors as items.
Old newspapers: The «Old Newspapers» dataset is published on Kaggle and released under the public domain license (CC0). The analysis must be done considering values of the Text attribute as baskets and words as items.

Project 3: Link analysis. The task is to implement a system ranking nodes in a graph using the PageRank index (or other approaches based on link analysis), processing the IMDB dataset described within Project 2. In this case, nodes in the graph will identify actors, and an edge will link two nodes if the corresponding actors played at least once in the same movie.

Project implementation. Important: the techniques used to analyze data have to scale up to larger datasets. The project can be carried out individually, or in groups of two students. Code should be written in Python 3. The project should be made available through a public GitHub repository, containing code and a report describing the work done.
The dataset should not be added to the repository, but downloaded during code execution, for instance via the Kaggle API (https://github.com/Kaggle/kaggle-api). Code should be implemented in a Jupyter notebook executable on Google Colab, possibly adding a badge/link directly from the repository to the Colab version of the notebook. The project report, preferably written in LaTeX, will be evaluated according to the following criteria: correctness of the general methodological approach, replicability of the experiments, correctness of the approach, scalability of the proposed solution, clarity of exposition. The report should contain the following information: the chosen dataset, and the parts of the latter which have been considered; how data have been organized; the applied pre-processing techniques; the considered algorithms and their implementations; how the proposed solution scales up with data size; a description of the experiments; comments and discussion on the experimental results.
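As an illustration of the graph described in Project 3, here is a minimal sketch in pure Python. It assumes the casts have already been extracted as lists of actor identifiers; the names `build_coactor_graph`, `pagerank`, and `toy_casts` are hypothetical, and the damping factor 0.85 is a common default rather than something the assignment specifies. It is not a scalable implementation, only a small in-memory example of the technique.

```python
from collections import defaultdict
from itertools import combinations


def build_coactor_graph(movie_casts):
    """Return an undirected adjacency map: actor -> set of co-actors."""
    graph = defaultdict(set)
    for cast in movie_casts:
        unique = sorted(set(cast))
        for actor in unique:
            graph[actor]  # touch the key so solo actors become isolated nodes
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def pagerank(graph, damping=0.85, tol=1e-10, max_iter=100):
    """Power iteration with uniform teleportation.

    The rank of nodes without neighbours (dangling nodes) is
    redistributed uniformly, so the ranks always sum to 1.
    """
    nodes = list(graph)
    n = len(nodes)
    if n == 0:
        return {}
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        dangling = sum(rank[v] for v in nodes if not graph[v])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        for v in nodes:
            if graph[v]:
                share = damping * rank[v] / len(graph[v])
                for u in graph[v]:
                    new_rank[u] += share
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical toy data: each inner list is the cast of one movie.
toy_casts = [["A", "B", "C"], ["A", "D"], ["E"]]
ranks = pagerank(build_coactor_graph(toy_casts))
```

In the toy data, actor "A" shares a movie with three others and therefore receives the highest rank. For the full dataset the same update rule would need a distributed or sparse-matrix formulation to meet the scalability requirement.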
For myself (Conduct research in machine learning and deep learning in Python. The research involves checking the consistency of star ratings on websites)
How can I find Python students near the Aleksandrovsky Sad metro station?
Register and create an attractive profile that mentions your specialization. Note the number of available student requests, which stands at 210 as of April 2026.
What are the requirements for tutors on your site?
Our site welcomes tutors at any level of qualification. We recommend listing all your certificates and education to improve your chances of finding students.
Can I set a flexible work schedule?
You have full control over your schedule and can discuss it directly with students to find a time that suits both sides.
What are the potential earnings for a Python tutor?
Earnings depend on the number of lessons and on your qualifications. On average, a single lesson earns 311.94 rubles. More lessons per week means higher income.