ssukharev/Java-REST-API-Scrapper

Java-REST-API-Scrapper

The REST API Scrapper polls three different APIs and saves the collected data in JSON or CSV format.

🎯 The task:

The task was to select at least three open, periodically updated REST APIs from a list of open APIs, for example open sources of news, articles, or weather observations, or the APIs of services such as social networks and online cinemas.

The next step was to develop an application that polls the data sources' APIs in several threads and saves new records to a CSV or JSON file (at the user's choice).

📋 Requirements:

  1. At launch, the application receives the following arguments:

    • The maximum number n of simultaneously active threads;

    • The timeout t in seconds, which sets the interval between polling iterations;

    • A list of names of the services to poll;

    • The file format for saving results.

  2. A separate thread is created for each data source. It polls the API, that is, it sends an HTTP request to the selected endpoints that return the data. No more than n such threads can run simultaneously. If more than n services are polled, the threads for the remaining services wait in a queue until one of the running threads finishes updating its data. A polling thread starts again t seconds after it completes, and it also waits in the queue if n polling threads are already active.

  3. All threads write data to the same file.

  4. To create threads, use the java.util.concurrent package.

  5. Unit tests must cover at least 70% of the code.

  6. To connect third-party libraries and frameworks during development, use the Maven or Gradle build tool.
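Requirement 1 can be illustrated with a small argument parser. This is only a sketch, not the repository's actual code: the class name `ScraperArgs`, its field names, the validation rules, and the case-insensitive handling of the format are all assumptions made for the example. The argument order follows the input strings shown in the work examples (n, t, comma-separated services, format).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

/** Hypothetical holder for the four launch arguments: n, t, services, format. */
final class ScraperArgs {
    final int maxThreads;        // n: maximum number of simultaneously active threads
    final long timeoutSeconds;   // t: pause between polling iterations, in seconds
    final List<String> services; // names of the services to poll
    final String format;         // "JSON" or "CSV"

    private ScraperArgs(int maxThreads, long timeoutSeconds,
                        List<String> services, String format) {
        this.maxThreads = maxThreads;
        this.timeoutSeconds = timeoutSeconds;
        this.services = services;
        this.format = format;
    }

    /** Parses input such as: 2 10 currentsapi,newsapi,openweathermap JSON */
    static ScraperArgs parse(String[] args) {
        if (args.length != 4) {
            throw new IllegalArgumentException(
                    "Expected: <n> <t> <service1,service2,...> <JSON|CSV>");
        }
        // NumberFormatException is a subclass of IllegalArgumentException.
        int n = Integer.parseInt(args[0]);
        long t = Long.parseLong(args[1]);
        if (n < 1 || t < 0) {
            throw new IllegalArgumentException("n must be >= 1 and t must be >= 0");
        }
        List<String> services = Arrays.stream(args[2].split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
        if (services.isEmpty()) {
            throw new IllegalArgumentException("At least one service name is required");
        }
        // Assumption: the format is accepted in any letter case.
        String format = args[3].toUpperCase(Locale.ROOT);
        if (!format.equals("JSON") && !format.equals("CSV")) {
            throw new IllegalArgumentException("Format must be JSON or CSV");
        }
        return new ScraperArgs(n, t, services, format);
    }
}
```

Unknown service names (such as `fooapi` in the examples) are deliberately not rejected here, since the sketch does not know which services the application supports.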
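The scheduling rule in requirement 2 (at most n active pollers, each restarting t after it completes) could be sketched with a `ScheduledExecutorService` from java.util.concurrent. This is one possible reading, not necessarily how the project implements it: the class `PollingScheduler` and the `pollOnce` callback are hypothetical, and the pool threads are shared between services rather than dedicated to one source each. The delay is taken in milliseconds only to keep the sketch easy to try out.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical scheduler: at most maxThreads polls run at the same time. */
final class PollingScheduler {
    private final ScheduledExecutorService pool;
    private final long delayMillis;

    PollingScheduler(int maxThreads, long delayMillis) {
        // A pool of n threads caps the number of simultaneously running polls;
        // tasks that are due but find no free thread wait in the pool's queue.
        this.pool = Executors.newScheduledThreadPool(maxThreads);
        this.delayMillis = delayMillis;
    }

    /** Schedules one repeating poll per service name. */
    void start(List<String> services, Consumer<String> pollOnce) {
        for (String service : services) {
            // scheduleWithFixedDelay measures the delay from the END of one run
            // to the start of the next, matching "t after completion".
            pool.scheduleWithFixedDelay(() -> {
                try {
                    pollOnce.accept(service);
                } catch (RuntimeException e) {
                    // An uncaught exception would cancel all future runs of
                    // this task, so failures are only reported here.
                    System.err.println("Polling " + service + " failed: " + e);
                }
            }, 0, delayMillis, TimeUnit.MILLISECONDS);
        }
    }

    /** Interrupts running polls and waits briefly for them to finish. */
    void stop() {
        pool.shutdownNow();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With n = 2 and three services, the third service runs as soon as one of the first two polls finishes, and each service is polled again only after its own delay has elapsed and a pool thread is free.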
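Because requirement 3 has all threads writing to the same file, writes need to be serialized so that records from different threads do not interleave. A minimal sketch, assuming one record per line and a hypothetical `SharedFileWriter` class (the project's actual writer may look different):

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical writer shared by all polling threads. */
final class SharedFileWriter implements AutoCloseable {
    private final BufferedWriter out;

    SharedFileWriter(Path file) {
        try {
            // Create the file if it is missing and append to it otherwise.
            this.out = Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes one record as a single line; only one thread writes at a time. */
    synchronized void writeRecord(String line) {
        try {
            out.write(line);
            out.newLine();
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public synchronized void close() {
        try {
            out.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The `synchronized` methods keep each line intact. This line-per-record approach suits CSV directly; a JSON file that must remain a single valid document would need extra handling, which this sketch does not cover.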

🧑🏻‍💻 Screenshots of work examples:

| Input string | Output in logger | Output in file |
| --- | --- | --- |
| 2 10 currentsapi,newsapi,openweathermap JSON | Screenshot 2025-08-15 at 02 52 53 | Screenshot 2025-08-15 at 02 54 57 |
| 2 10 currentsapi,newsapi,openweathermap CSV | Screenshot 2025-08-15 at 02 59 53 | Screenshot 2025-08-15 at 03 00 37 |
| 2 10 fooapi,newsapi,pooapi CSV | Screenshot 2025-08-15 at 02 39 32 | Screenshot 2025-08-15 at 02 45 34 |
| 2 10 newsapi JSON | Screenshot 2025-08-15 at 03 07 54 | Screenshot 2025-08-15 at 03 08 36 |

etc.

P.S. In the first two rows the news data is the same (only the weather data differs) because I limited the list of topics for which I receive news. The topics can be changed in this file, where you also insert your API key.

Test result:

Screenshot 2025-08-15 at 03 37 25

Contribute

I welcome any contribution to the development of my repositories! Read the Contribution Guide to get started.

List of reference sources used in the project

  1. About the REST API

  2. About the JSON format

  3. The Jackson library for working with JSON

  4. Apache HttpClient, an HTTP client that can be used to send requests to endpoints.
