Data Prep Ultimately Allows for Prediction Model

Who is the client and what was their business problem?

The client is an artificial intelligence fintech company. It was seeking a suitable data wrangling (data preparation) methodology to feed relevant data into its intelligent Financial Prediction Model, which extracts investor behavior in the US stock market and analyzes institutional investors’ interests.



What services did Data-Core initially provide? What was challenging about solving the client’s problem?

Data-Core’s main focus was to develop an intelligent Financial Prediction Model for extracting investor behavior in the US stock market, in order to make more decision-support data available and thereby increase investor satisfaction. Because the model was built on this data to perform stock portfolio analysis, data preparation was essential to achieving good results.

  • Initially, data was pulled from the MySQL data tables and extracted to the Client Server every 45 days.
  • The data was then loaded on a weekly basis from the Client Server to the DC Server using an ETL Scheduler.
  • The raw data loaded onto the DC Server then underwent a Data Cleanup and Data Wrangling process to prepare the Training Set and Test Set used as the Machine Learning Data Input.
  • The prepared datasets (Training Set and Test Set) were then uploaded from the DC Server to the Client Server, and from there pulled into Azure ML Studio.
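
The cleanup and split step in the list above can be sketched roughly as follows. This is a minimal illustration only, not Data-Core's actual scripts: the field names (`ticker`, `date`, `shares_held`), the 20% test fraction, and the choice of a chronological split are all assumptions made for the example.

```python
def clean_rows(rows):
    """Drop rows with missing values and exact duplicates (first occurrence kept)."""
    seen = set()
    cleaned = []
    for row in rows:
        # Treat None and empty strings as missing values.
        if any(value is None or value == "" for value in row.values()):
            continue
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


def split_train_test(rows, test_fraction=0.2, date_field="date"):
    """Split chronologically: the most recent rows become the test set.

    Assumes date_field holds ISO-formatted strings (YYYY-MM-DD), which sort
    correctly as plain text.
    """
    ordered = sorted(rows, key=lambda r: r[date_field])
    n_test = int(len(ordered) * test_fraction)
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


# Hypothetical raw records, standing in for data loaded onto the DC Server.
raw = [
    {"ticker": "AAA", "date": "2020-01-06", "shares_held": 100},
    {"ticker": "AAA", "date": "2020-01-13", "shares_held": 120},
    {"ticker": "AAA", "date": "2020-01-13", "shares_held": 120},  # duplicate
    {"ticker": "BBB", "date": "2020-01-20", "shares_held": None},  # missing value
    {"ticker": "BBB", "date": "2020-01-27", "shares_held": 80},
    {"ticker": "CCC", "date": "2020-02-03", "shares_held": 50},
    {"ticker": "CCC", "date": "2020-02-10", "shares_held": 55},
]

cleaned = clean_rows(raw)
train_set, test_set = split_train_test(cleaned)
```

A chronological split is used here because, for time-ordered financial data, a random split can leak future information into the training set. Whether the original project split its data this way is not stated.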

One challenge Data-Core faced and overcame was understanding the characteristics of the data well enough to prepare the most suitable training and test datasets, such that the accuracy of the Financial Prediction Model would be greater than 65%. Another challenge Data-Core overcame was developing the automation scripts for data preparation.
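
As a rough illustration of the accuracy target mentioned above, the check could look like the sketch below. The "buy"/"sell" labels and the function names are hypothetical, and accuracy is assumed to mean the fraction of test-set predictions that match the actual outcomes. Only the 65% threshold comes from the project description.

```python
TARGET_ACCURACY = 0.65  # threshold from the project description (greater than 65%)


def accuracy(predicted, actual):
    """Fraction of predictions that exactly match the actual labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def meets_target(predicted, actual, target=TARGET_ACCURACY):
    """True only if accuracy is strictly greater than the target."""
    return accuracy(predicted, actual) > target


# Hypothetical labels for four test-set records.
actual_labels = ["buy", "sell", "buy", "sell"]
predicted_labels = ["buy", "sell", "buy", "buy"]

score = accuracy(predicted_labels, actual_labels)
passed = meets_target(predicted_labels, actual_labels)
```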


What stand-out points are there from these services?

Data-Core followed a simple, systematic data wrangling process so that the data was meaningful and suitable as the training and test sets for the ML data input.



Summary:

Data-Core’s simple data wrangling and data preparation process was essential to the successful implementation of the Financial Prediction Model project. The dashboard designs, which were claimed to be among the most complicated Dundas BI dashboards in the industry, were a great success and left the clients satisfied. The previous projects were followed by data migration work from MySQL to a Data Warehouse. The developed dashboards are currently under a maintenance contract.
