Applied AI Pod
NLP, Speech Tech, Transformer Models, w/ Marc von Wyl, Algolia, E30
Episode Summary
Join a conversation with Marc von Wyl, Senior ML Engineer at Algolia. Marc teaches Natural Language Processing at EPITA and is an experienced computer scientist specializing in Natural Language Processing, Machine Learning, and languages in general. Together, we dig into: unstructured data, reducing error and ambiguity, and the future of NLP.
Episode Notes
- 01:15 - How does NLP work?
- 04:05 - How do Transformer-based NLP models work?
- 08:20 - How to examine unstructured data to take better advantage of it.
- 12:00 - How to leverage ML to get more out of unstructured data.
- 15:25 - Approaches for low-resource languages.
- 23:25 - Word embeddings for commonsense reasoning needs.
- 26:55 - Techniques for reducing error and ambiguity in training data, or in a model in general.
- 30:10 - Are GPTs leading the field in the wrong direction?
- 34:15 - Is deep learning the end of AI?
- 37:20 - What are some good NLP metrics to watch?
- 42:05 - How do we get past transactional queries to conversational queries?
- 52:00 - Is the Turing test still relevant for NLP or has it become obsolete?
References: