A long, long time ago, I had one dream in my little boy's mind: to make computers able to understand human language. I was 9, and I learned BASIC and assembly language to program a 6809 processor. Later, I started a PhD thesis in computational linguistics. I worked on symbolic approaches and, at the time, I was contemptuous of statistical approaches to language understanding. During my first postdoc, however, I started playing around with machine learning, digging deep into Latent Semantic Analysis.
At this point, I could not return to symbolic approaches. After a couple of years, I made a major discovery. I must admit that I discovered the basics of the NCISC algorithm mostly by chance. Luckily, this discovery happened just after I left my last postdoc position at the CEA, with Gregory Grefenstette, the best advisor of my short career in academia.
I joined a group of my friends for an "incubation" period, and three years later we created eXenSa to work on the e-commerce recommendation business. Unfortunately, our first product didn't catch on. It seems that having the best technology is not always decisive in business; you also have to learn how to sell.
In 2014, we finished the first version of eXenGine, our analysis engine, usable for text mining, graph analysis and behavioral prediction.
That's how the story begins. I'm sure you will want to help us write the rest of it.
Who we are
Our company eXenSa has multiple facets:
- We do high-level consulting for tech companies needing help in machine learning and data science.
- We have invented, developed and implemented NCISC, a data characterization algorithm for unsupervised and semi-supervised learning that can be used directly for nearest neighbour searches and KNN classification, or as a preprocessing step to improve and speed up other machine learning methods.
- We sell licences and customizations of eXenGine, our data processing engine that implements NCISC. A demo of eXenGine processing the English Wikipedia (with complete vocabulary) is available here: wikinsights.org
- We are in the process of making eXenGine available as a Service on datagist.io
- We do high-profile training in GPGPU and distributed computing (Apache Spark)
You can find more details about the science behind the products on Guillaume Pitel's blog. Guillaume is the founder and principal scientist of eXenSa.
Fast and Scalable
The core algorithm created by eXenSa and Guillaume Pitel is extremely simple. Its complexity is almost equivalent to that of multiplying a dense matrix by a sparse matrix, an operation that is extremely scalable: the sparse matrix can be split into blocks, which can then be distributed.
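To give a feel for why this kind of operation scales, here is a minimal sketch of a sparse-times-dense product computed one row block at a time. It is not the NCISC algorithm itself, only the matrix operation its cost is compared to. The function names and the tiny matrices are hypothetical, and plain Python lists are used for readability; real code would rely on a sparse-matrix library.

```python
# Illustrative sketch only: a sparse-times-dense product computed block by block.
# This is NOT the NCISC algorithm; all names and data here are hypothetical.

def sparse_times_dense(sparse_rows, dense):
    """Multiply a sparse matrix by a dense one.

    sparse_rows: list of rows, each a list of (column, value) pairs
                 holding only the non-zero entries.
    dense: list of rows, each a list of k floats.
    The cost is proportional to (number of non-zeros) * k.
    """
    k = len(dense[0])
    result = []
    for row in sparse_rows:
        out = [0.0] * k
        for col, value in row:
            dense_row = dense[col]
            for j in range(k):
                out[j] += value * dense_row[j]
        result.append(out)
    return result


def blockwise_product(sparse_rows, dense, n_blocks):
    """Split the sparse matrix into row blocks and multiply each one independently.

    A block needs only its own rows plus the dense matrix, so blocks could be
    sent to different workers. Here they simply run one after another.
    """
    n = len(sparse_rows)
    size = max(1, -(-n // n_blocks))  # ceiling division, at least one row per block
    result = []
    for start in range(0, n, size):
        result.extend(sparse_times_dense(sparse_rows[start:start + size], dense))
    return result


# Tiny example: a 4 x 3 sparse matrix times a 3 x 2 dense matrix.
sparse_rows = [
    [(0, 1.0)],            # row 0: [1, 0, 0]
    [(1, 2.0), (2, 3.0)],  # row 1: [0, 2, 3]
    [],                    # row 2: all zeros
    [(0, 4.0), (2, 5.0)],  # row 3: [4, 0, 5]
]
dense = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

full = sparse_times_dense(sparse_rows, dense)
blocked = blockwise_product(sparse_rows, dense, n_blocks=2)
print(blocked == full)  # the block-wise result matches the single-pass result
```

Because each row block is processed without looking at any other block, the work can be spread over as many machines as there are blocks, and the total cost grows with the number of non-zero entries rather than with the full matrix size.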
Despite its simplicity, NCISC produces very high quality representations, on par with and generally better than those of Word2Vec, ALS, Stochastic Gradient Descent, Autoencoders and LDA. The result is a dense and very compact representation that can be used efficiently with a linear classifier, a clustering algorithm, or neighbour search.
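As an illustration of how dense, compact vectors can be used downstream, the following sketch runs a cosine-similarity neighbour search and a simple KNN vote over a handful of vectors. The vectors, labels and function names are made up for the example; they are not output from NCISC or eXenGine.

```python
# Illustrative sketch only: neighbour search and KNN classification over dense vectors.
# The vectors and labels below are invented for the example, not produced by NCISC.
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbours(query, vectors, k):
    """Return the names of the k vectors most similar to the query."""
    ranked = sorted(vectors, key=lambda name: cosine(query, vectors[name]), reverse=True)
    return ranked[:k]


def knn_classify(query, vectors, labels, k):
    """Predict a label by majority vote among the k nearest neighbours."""
    votes = Counter(labels[name] for name in nearest_neighbours(query, vectors, k))
    return votes.most_common(1)[0][0]


# Hypothetical 3-dimensional representations.
vectors = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.2],
    "truck": [0.0, 0.8, 0.3],
}
labels = {"cat": "animal", "dog": "animal", "car": "vehicle", "truck": "vehicle"}

query = [0.85, 0.15, 0.05]
neighbours = nearest_neighbours(query, vectors, k=2)
predicted = knn_classify(query, vectors, labels, k=3)
print(neighbours, predicted)
```

This brute-force search compares the query with every vector, which is fine for a small example. With millions of items, an approximate nearest-neighbour index would be the more usual choice.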
External knowledge injection
NCISC can be fed additional information that allows it to perform even better on tasks like classification. By combining unsupervised and semi-supervised learning, we are often able to reduce classification error by 20 to 50%. For more information, look at these posts on Guillaume Pitel's blog: here and here.
Morgane is a young PhD and Computer Scientist. She specializes in Natural Language Processing…
Yoann is a Machine Learning expert. He brings expertise and advice to our team.
Laurent is one of the first associates of eXenSa. He brings his expertise in the…
Geoffroy is the first employee of eXenSa. He is a Computer Scientist (from the EPITA school)…