PYTHON - DATA PROCESSING

Photo of the SAE 105 integrative project in Python

Project Presentation

As part of my first year of the BUT Networks & Telecommunications at the IUT of Annecy, I completed an individual project in Python titled SAE 105: Data Processing. The objective was to manipulate, process, and present data from a CSV file containing more than 36,000 rows and 27 columns on French cities.

Project Objective

  • Develop a Python program that adheres to precise specifications.
  • Learn to read and process large CSV files.
  • Implement an interactive menu offering different features (statistics, distances between cities, mapping, etc.).
  • Produce graphical visualizations (histograms, interactive maps with Folium).
  • Get used to working independently with regular deliverables.

Project Process and Accomplishments

Skills Used

  • Python programming: loops, functions, list management, sorting (bubble sort), interactive menus.
  • Data processing: information extraction and cleaning, statistical calculations (mean, standard deviation, population growth).
  • Visualization: using matplotlib, folium, and branca libraries to plot maps and graphs.
  • Project management: respecting deadlines, working autonomously, delivering progressive results.
  • Analytical mindset: structuring a complex problem into clear steps.
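
As an illustration of the programming and statistics skills listed above, here is a minimal sketch of a bubble sort and of mean and standard deviation calculations written with plain loops. The function names and the sample values are hypothetical and are not taken from the project's code.

```python
import math


def bubble_sort(values):
    """Return a new list sorted in ascending order using bubble sort."""
    items = list(values)  # work on a copy so the caller's list is untouched
    n = len(items)
    for i in range(n - 1):
        swapped = False
        # After each pass, the largest remaining value has moved to the end.
        for j in range(n - 1 - i):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        if not swapped:
            break  # no swap in this pass: the list is already sorted
    return items


def mean(values):
    """Arithmetic mean computed with an explicit loop."""
    total = 0
    for v in values:
        total += v
    return total / len(values)


def std_dev(values):
    """Population standard deviation (divides by N, not N - 1)."""
    m = mean(values)
    squared_diffs = 0
    for v in values:
        squared_diffs += (v - m) ** 2
    return math.sqrt(squared_diffs / len(values))


# Made-up population figures, for illustration only.
populations = [1200, 350, 48000, 7600]
print(bubble_sort(populations))  # [350, 1200, 7600, 48000]
print(mean(populations))
```

Bubble sort runs in quadratic time, so on tens of thousands of rows it is noticeably slower than Python's built-in `sorted`; it is shown here because it is the algorithm named in the skills list.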

Features Developed

  • Data extraction from the CSV file (cities, departments, populations, density, GPS coordinates, altitudes).
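
The extraction step can be sketched with the standard `csv` module. This is a minimal sketch under stated assumptions: the column positions are passed as parameters because the actual layout of villes_france.csv is not described here, and the function name `load_cities` and the sample rows are hypothetical.

```python
import csv
import io


def load_cities(csv_file, name_col, dept_col, pop_col):
    """Read an open CSV file and keep only the chosen columns.

    Column positions are parameters (assumption): the real file layout
    is not given in this write-up.
    """
    cities = []
    for row in csv.reader(csv_file):
        try:
            city = {
                "name": row[name_col],
                "department": row[dept_col],
                "population": int(row[pop_col]),
            }
        except (ValueError, IndexError):
            continue  # skip rows that are too short or not numeric
        cities.append(city)
    return cities


# Made-up sample standing in for the real file; values are illustrative only.
sample = io.StringIO(
    '"Annecy","74","50000"\n'
    '"Broken","74","n/a"\n'
    '"Lyon","69","480000"\n'
)
cities = load_cities(sample, name_col=0, dept_col=1, pop_col=2)
print(len(cities), "rows kept")
```

With the real file, the same function would receive a handle such as `open("villes_france.csv", newline="", encoding="utf-8")`; the encoding is an assumption. Reading row by row avoids loading the raw text of all 36,700 lines at once.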

  • Statistics:
      • The 5 most and least populated cities in a department.
      • The 10 cities with the highest and lowest density.
      • The cities with the largest population increase or decline between 1999 and 2012.
  • Cartographic visualization: displaying cities on an OpenStreetMap base map with circles sized in proportion to their density or population.
  • Histogram: distribution of cities according to their number of inhabitants in 2010.
  • Distances: calculation of the Euclidean distance between two cities (e.g., Paris – Marseille).
  • Pathfinding algorithm: searching for a path between two cities using geographical proximity.
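
The distance feature above can be illustrated as follows. The write-up does not say how GPS coordinates were turned into a distance, so this sketch makes an assumption: it projects latitude and longitude onto a flat plane (equirectangular approximation) and then applies the Euclidean formula. The function name, the Earth radius constant, and the coordinates are supplied here for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def euclidean_distance_km(lat1, lon1, lat2, lon2):
    """Planar (Euclidean) distance in km between two GPS points.

    Assumption: coordinates are in decimal degrees and the two points are
    close enough for a flat projection to be acceptable.
    """
    lat1_rad, lat2_rad = math.radians(lat1), math.radians(lat2)
    mean_lat = (lat1_rad + lat2_rad) / 2
    # East-west offset, shrunk by cos(latitude) because meridians converge.
    x = math.radians(lon2 - lon1) * math.cos(mean_lat) * EARTH_RADIUS_KM
    # North-south offset.
    y = (lat2_rad - lat1_rad) * EARTH_RADIUS_KM
    return math.sqrt(x ** 2 + y ** 2)


# Approximate city-centre coordinates (not read from the project's CSV file).
paris = (48.8566, 2.3522)
marseille = (43.2965, 5.3698)
print(round(euclidean_distance_km(*paris, *marseille)), "km")
```

For Paris and Marseille this gives a straight-line distance of roughly 660 km. Over much longer distances the flat approximation drifts, and a great-circle formula such as haversine would be the usual replacement.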

Tools Used

  • Python 3
  • Libraries: matplotlib, folium, branca
  • CSV file: villes_france.csv (≈ 36,700 rows)
  • Moodle: submission and tracking of deliverables

Difficulties Encountered

  • Manipulating a large file (processing optimization).
  • Implementing sorting and statistical calculation algorithms using only concepts seen in class.
  • Using new libraries like Folium to display interactive maps.
  • Time management with frequent submissions for each session.

Overall Project Conclusion

SAE 105 was a formative experience that allowed me to put my knowledge of Python programming and data processing into practice. I learned to manage a large file, extract useful information, and represent it through statistics, graphs, and interactive maps. This project also taught me to work independently, follow a set of specifications, and overcome technical difficulties. Ultimately, I was able to create a complete and structured program, which is a significant first experience in the field of data processing and analysis applied to networks.