Artificial intelligence algorithms are now advanced enough to cope with the demands of Big Data. Technologies such as Machine Learning and Deep Learning offer autonomous learning capabilities that can rival, and sometimes surpass, human capabilities. Analytics can now be harnessed to help companies identify new sources of revenue, personalize customer relations and anticipate customer behavior. Companies can be proactive and base their decisions on facts rather than intuition.
A unique, data-driven approach to measuring advertising effectiveness was developed at Mars and rolled out globally. The methodology relies on datasets that are available in various large markets. We used the data fusion technique from the ‘Data Fusion’ use case to extend the methodology to smaller markets.
The traditional approach to applying machine learning to a couponing dataset is to predict who is going to redeem the coupon. Yet when the coupon comes at a cost (in effect, paying customers to buy your product), this approach is not suitable, because it also targets customers who would have bought anyway. A better methodology considers the expected impact of the coupon on the future purchases of the customer who receives it. We validated this methodology with a randomized controlled experiment and found an ROI of 5% on the targeted population.
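To illustrate the difference between predicting redemption and predicting impact, here is a minimal sketch of a per-segment uplift estimate. It is not the model used in the project: the segment names, the spend figures and the `coupon_cost` parameter are hypothetical, and a real system would typically use a learned model rather than segment averages.

```python
from collections import defaultdict

def uplift_by_segment(records):
    """Estimate, per segment, the average extra spend caused by the coupon.

    Each record is (segment, got_coupon, later_spend). The uplift is the mean
    spend of customers who received the coupon minus the mean spend of those
    who did not, within the same segment.
    """
    # For each segment keep [total_spend, count] for treated and control.
    sums = defaultdict(lambda: {True: [0.0, 0], False: [0.0, 0]})
    for segment, got_coupon, spend in records:
        cell = sums[segment][bool(got_coupon)]
        cell[0] += spend
        cell[1] += 1
    uplift = {}
    for segment, cells in sums.items():
        (t_sum, t_n), (c_sum, c_n) = cells[True], cells[False]
        if t_n and c_n:  # both groups are needed for a comparison
            uplift[segment] = t_sum / t_n - c_sum / c_n
    return uplift

def segments_to_target(uplift, coupon_cost):
    """Keep only segments whose estimated uplift exceeds the coupon cost."""
    return sorted(s for s, u in uplift.items() if u > coupon_cost)

# Hypothetical data: (segment, got_coupon, spend in the following period)
records = [
    ("loyal", True, 20.0), ("loyal", True, 22.0),
    ("loyal", False, 20.0), ("loyal", False, 21.0),
    ("occasional", True, 12.0), ("occasional", True, 14.0),
    ("occasional", False, 5.0), ("occasional", False, 7.0),
]
uplift = uplift_by_segment(records)
targets = segments_to_target(uplift, coupon_cost=2.0)
print(uplift, targets)
```

In this toy data the loyal customers redeem coupons but barely change their spend, so only the occasional segment is worth targeting.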
Randomised controlled trials
Being a data-driven organisation implies using proven methods to validate marketing and sales hypotheses. Generally, randomised controlled trial (RCT) methodologies and/or carefully planned A/B tests are used. Yet these methodologies fail when the sample is small. Our methodology uses a repost optimisation technique to design the experiment: it suggests the optimal random split and measures the impact of the treatment after the experiment ends. We achieved a 60% noise reduction, which makes it possible to measure even small impacts.
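The text does not detail how the split is optimised. One common way to reduce noise in small experiments is rerandomisation: draw many random splits and keep the one whose groups are best balanced on a pre-experiment covariate. The sketch below shows that idea only, as an assumption about what such a design step could look like; the function names, the `past_sales` data and the number of candidates are hypothetical.

```python
import random
from statistics import mean

def balanced_split(covariate, n_candidates=1000, seed=0):
    """Pick, among random half/half splits, the one with the best balance.

    `covariate` holds one pre-experiment value per unit (for example past
    sales). Balance is the absolute difference between the group means.
    """
    rng = random.Random(seed)
    indices = list(range(len(covariate)))
    half = len(indices) // 2
    best_split, best_gap = None, float("inf")
    for _ in range(n_candidates):
        rng.shuffle(indices)
        treated, control = indices[:half], indices[half:]
        gap = abs(mean(covariate[i] for i in treated)
                  - mean(covariate[i] for i in control))
        if gap < best_gap:
            best_split, best_gap = (sorted(treated), sorted(control)), gap
    return best_split, best_gap

def treatment_effect(outcome, treated, control):
    """Difference in mean outcome between the groups after the experiment."""
    return mean(outcome[i] for i in treated) - mean(outcome[i] for i in control)

# Hypothetical pre-experiment sales for 12 stores.
past_sales = [10, 12, 50, 48, 30, 31, 90, 88, 15, 14, 60, 62]
(treated, control), gap = balanced_split(past_sales)
print(treated, control, gap)
```

Because the two groups start from nearly identical baselines, a smaller share of the measured difference after the experiment is due to chance imbalance.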
Late client payment
In a B2B environment, estimating income from clients is an important input for cash-flow planning. Knowing in advance which clients will pay on or before the deadline is a key input to this process. A dataset of 300 clients and their historical transactions was provided and analyzed. We predict both a binary outcome (who is going to miss the deadline) and an estimate of how late the client will be (expressed in days of delay). The system is already deployed in production and achieved 75% performance.
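As a rough illustration of the two outputs (a late/on-time flag and an expected delay), here is a naive baseline that uses only each client's own payment history. It is not the deployed model; the record layout, the `threshold` parameter and the sample data are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

def payment_profile(transactions):
    """Summarise each client's history as (late_rate, mean_days_late).

    Each transaction is (client_id, days_late), where days_late <= 0 means
    the invoice was paid on or before the deadline.
    """
    history = defaultdict(list)
    for client_id, days_late in transactions:
        history[client_id].append(days_late)
    profile = {}
    for client_id, delays in history.items():
        late = [d for d in delays if d > 0]
        late_rate = len(late) / len(delays)
        mean_delay = mean(late) if late else 0.0
        profile[client_id] = (late_rate, mean_delay)
    return profile

def predict(profile, client_id, threshold=0.5):
    """Return (will_be_late, expected_days_late) for the next invoice."""
    late_rate, mean_delay = profile[client_id]
    return late_rate >= threshold, late_rate * mean_delay

# Hypothetical history: (client_id, days late on a past invoice)
transactions = [
    ("A", 0), ("A", 10), ("A", 20), ("A", -2),
    ("B", 0), ("B", -1), ("B", 3),
]
profile = payment_profile(transactions)
print(predict(profile, "A"), predict(profile, "B"))
```

A baseline like this is mainly useful as a reference point: a trained model should beat it before it is trusted for cash-flow planning.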
Assessing a client's solvency is important before doing business with them, especially for private, non-listed companies. Traditional techniques rely on financial statements only. These usually become available 6 months after the end of the fiscal year and are often late for small companies. The client's assignment was to develop a model that predicts bankruptcy based only on textual information published in various outlets (Google, Trends/Tendances, Twitter, Official Journal, annual reports).
We developed tools that analyse unstructured text (reports, emails, …) and semi-structured text (invoices, forms, …) and extract the available information. This information can be exported as a table and used for more standard analysis. The example below shows a sample extraction from a Key Information Document for a bank's investment product. Below the document is the extracted table with the required information.
Healthcare ‘no shows’
Hospitals suffer from patients not showing up on their appointment date. This translates into a loss of money and resources. When hospitals try to manage it by overbooking, the problem worsens: waiting times increase and satisfaction decreases. Based on historical data, the goal was to predict who is going to miss an appointment and to suggest preventive actions.
Ticket sale prediction
The data consists of ticket pre-sales for 15 rugby matches of the French national team, from 2009 to 2015. These matches comprise both training matches and matches belonging to the Six Nations tournament. This study on forecasting ticket pre-sales for rugby matches shows great potential for future forecasts. With the available data we were able to forecast the ticket sales of several matches with great accuracy, each match showing a different sales pattern. By considering more variables and studying the data in more detail, the forecast of each match's pre-sales can be made more accurate.
Customer-facing services deal daily with a large volume of emails that need to be sorted and forwarded and, most importantly, answered. We developed a solution that optimises several tasks:
- Email filtering: don’t forward emails that do not need follow-up or handling, for example out-of-office replies, notifications, automated system emails, ads, …
- Dispatching emails according to taxonomies. We support 2 taxonomies by default; more can be configured.
- Determine the intent implied by the email.
- Suggest a reply to the emails.
- Auto reply to some emails.
The software is available as an API, which makes it easy to integrate with existing systems.
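The first two tasks in the list above, filtering and dispatching, can be sketched with simple keyword rules. This is only a stand-in to show the flow of a triage step: the patterns, the category names and the `triage` function are hypothetical and do not describe the product's actual API or models.

```python
import re

# Hypothetical patterns for messages that need no follow-up.
NO_FOLLOW_UP = [
    re.compile(r"out of (the )?office", re.IGNORECASE),
    re.compile(r"do not reply|no-?reply", re.IGNORECASE),
    re.compile(r"unsubscribe", re.IGNORECASE),
]

# Hypothetical taxonomy: category -> keywords that suggest it.
TAXONOMY = {
    "billing": ("invoice", "payment", "refund"),
    "technical": ("error", "bug", "password"),
}

def needs_follow_up(email_text):
    """False for automatic replies, notifications and ads."""
    return not any(p.search(email_text) for p in NO_FOLLOW_UP)

def dispatch(email_text):
    """Return the category with the most keyword hits, or 'general'."""
    text = email_text.lower()
    scores = {cat: sum(text.count(k) for k in kws)
              for cat, kws in TAXONOMY.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"

def triage(email_text):
    """Return None if the email can be dropped, else its category."""
    if not needs_follow_up(email_text):
        return None
    return dispatch(email_text)

print(triage("I am out of office until Monday."))
print(triage("My invoice shows a wrong payment amount."))
```

Supporting a second taxonomy in this sketch would amount to passing a different keyword table to `dispatch`; intent detection and reply suggestion would need a language model and are out of scope here.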
Customer feedback analysis
Analyzing a large set of customer feedback is difficult and prone to bias and human error. Semantic technologies help by automatically processing large amounts of feedback, detecting topics and sentiments, discarding small and non-significant topics and finding relationships between complaint-related concepts. This makes it possible to estimate the proportions of the various topics and link them to customer satisfaction or sentiment.
This startup launched a viral application that recommends a movie based on the user's mood by offering a gamified experience. In this experience, the user answers 3 questions (selected based on the user profile) by pressing a Yes/No button. These answers make it possible to determine the mood and suggest a movie accordingly. The recommendation is based on the semantic analysis of the questions and the movie plots, together with the user profile.
Solar energy predictions
In this project we had access to historical production data from a large fleet of household photovoltaic panels. We also had access to historical weather forecast data generated by a university lab. The statistical approach is based on machine learning algorithms and uses approximately 6000 models that run simultaneously (one model per household installation). These models outperform rival models, including models based on physical modeling of the photoelectric process.
We achieved an aggregated accuracy of more than 99%. This accuracy, together with the household-specific models, makes it possible to accurately detect breakdowns and send maintenance notifications. A predictive maintenance model is currently under consideration.
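Breakdown detection from per-household forecasts can be sketched as a comparison between predicted and observed production. The rule below (production under half of the forecast for three periods in a row) is an illustrative assumption, as are the function name, the parameters and the sample data; the project's actual detection logic is not described here.

```python
def detect_breakdowns(predicted, actual, tolerance=0.5, min_consecutive=3):
    """Flag installations whose production stays far below the forecast.

    `predicted` and `actual` map an installation id to a list of production
    values over the same periods. An installation is flagged when
    actual < tolerance * predicted for `min_consecutive` periods in a row.
    """
    flagged = []
    for install_id, forecast in predicted.items():
        run = 0
        for expected, observed in zip(forecast, actual[install_id]):
            if expected <= 0:
                # No production expected (for example at night):
                # the period neither counts nor resets the run.
                continue
            if observed < tolerance * expected:
                run += 1
                if run >= min_consecutive:
                    flagged.append(install_id)
                    break
            else:
                run = 0
    return flagged

# Hypothetical forecasts and measurements (kWh per period).
predicted = {"house_a": [5, 6, 5, 6], "house_b": [4, 5, 0, 5, 4]}
actual = {"house_a": [5.1, 5.8, 4.9, 6.2], "house_b": [4.1, 1.0, 0, 0.5, 0.2]}
print(detect_breakdowns(predicted, actual))
```

Requiring several consecutive low periods keeps a single cloudy hour or a sensor glitch from triggering a maintenance notification.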