Thierno Ibrahima Diop
Verified Expert in Engineering
Data Scientist and Developer
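Thierno is a lead data scientist with a passion for natural language processing (NLP) and machine learning (ML). He has mentored apprentice data scientists for three years. Before that, he spent three years freelancing in web and mobile application development. Thierno is co-founder of GalsenAI, an artificial intelligence (AI) community in Senegal, a Coursera instructor on data science, and a Google developer expert in ML.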
Portfolio
Experience
Availability
Preferred Environment
Jupyter Notebook, Visual Studio Code (VS Code), TensorFlow, PyTorch, Scikit-learn, Keras, Flask, SpaCy, Gensim, OpenAI
The most amazing...
...model I've developed is a system that detects different security issues in code. It was built using large language models such as GPT and LLaMA.
Work Experience
CEO | Lead Data Scientist
NuurAI
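- Led a team of machine learning engineers applying deep learning to detect popular reciters from audio input.
- Mentored machine learning engineers in applying deep learning to compute the similarity between a user and reciters.
- Helped the team implement deep learning techniques and experiment with them on our use cases.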
Senior Interview Engineer
Karat
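- Completed more than 400 interviews in less than a year and was promoted to senior.
- Handled quality control of other interviewers' work before results were shared with clients.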
- Gave live reviews for the onboarding of new interviewers.
AI Developer via Toptal
Desert Moon Speech Services LLC
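- Collected data to convert audio to phonemes, then processed it to handle noise, duration, and International Phonetic Alphabet (IPA) conversion.
- Trained a simple classification model at the phoneme level using transfer learning.
- Reframed the problem as speech recognition to gain more context and more available data.
- Handled label imbalance, as some phonemes are rare.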
NLP Research Engineer
FLock.io
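- Tested different prompt techniques (zero-shot learning, few-shot learning, chain-of-thought) with different LLMs on more than 20 security issues.
- Optimized LLMs to address complex security issues and prepared data for the models.
- Created pipelines to process code with intermediate representation and evaluate LLMs.
- Used embeddings from LLMs to perform topic modeling with GMM and LDA.
- Used LLMs to generate code and created agents to fuzz test different security issues.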
- Built the API and created the releases used in production.
- Used multithreading to reduce prediction and inference time.
Lead Data Scientist
Baamtu
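- Created a text-to-speech program for the Wolof language. Developed an algorithm to convert Wolof text to phonemes, coordinated data collection with two participants, and evaluated phoneme coverage.
- Contributed to automatic speech recognition for the Wolof language. Designed a platform to collect raw Wolof audio for self-supervised learning.
- Built optical character recognition (OCR) and computer vision models to extract structured data from national ID cards. Deployed the models on-premises and as AWS Lambda functions for scalability. Built a rotation model to handle image rotation.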
Data Scientist
Baamtu
- Used NLP and NLU to extract useful information from legal text. Developed a regex tester library.
- Developed an extractive chatbot for a telecommunications company to automate FAQs, collecting data by scraping websites and Twitter.
- Performed data collection and annotation. Deployed using AWS Lambda.
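- Developed a rule system using Spark and implemented a flexible scoring system using Apache Airflow, with job management and scheduling of the scoring system.
- Performed customer segmentation in the telecommunications sector using data from multiple sources. Compared clustering models against theoretical and business metrics.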
Developer
Freelance
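- Worked for multiple clients simultaneously as a full-stack web and mobile developer.
- Participated in the ideation and implementation of the prodispo mobile and web applications.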
- Developed a web application for the purchase of phone credit.
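- Created and used a WebChat application built on WebSocket.
- Developed REST APIs for the dematerialization of meetings at Gainde 2000, a strategic platform of the Senegalese customs focused on customs clearance management.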
- Created a web app for various football competitions.
- Built a web service and a social cross-platform mobile application.
- Developed and orchestrated a news website using WordPress.
Experience
Automatic Speech Recognition for the Wolof Language.
This project was challenging due to the scarcity of data, so multiple techniques and tricks were used to make it work.
Wolof Speech Recognition
Chatbot for Customer Support in Telecommunication
Tested and compared multiple text feature extraction methods and models using several similarity metrics.
Education
Master's Degree in Computer Science
École Supérieure Polytechnique de Dakar - Dakar, Senegal
Bachelor's Degree in Computer Science
École Supérieure Polytechnique de Dakar - Dakar, Senegal
Certifications
Cloudera CCA 175 Spark and Hadoop Developer
Cloudera
Skills
Libraries/APIs
TensorFlow, Scikit-learn, Keras, Pandas, Matplotlib, PyTorch, SpaCy, React, NumPy, SciPy, DeepSpeech
Tools
Gensim, Apache Airflow, Amazon Textract, Amazon SageMaker, ChatGPT, Kaldi, Git, Seaborn, TensorBoard, Whisper
Frameworks
Flask, Spark, Streamlit, Symfony, Angular, Ionic, Scrapy
Languages
Python 3, Python, Bash Script, SQL, PHP, Java, R
Platforms
Jupyter Notebook, Amazon EC2, Amazon Web Services (AWS), AWS Lambda, Docker
Storage
Amazon S3 (AWS S3), PostgreSQL, Amazon DynamoDB, Databases
Paradigms
Fuzz Testing
Other
Natural Language Processing (NLP), Audio, Artificial Intelligence (AI), Machine Learning, Neural Networks, Hiring, Code Review, Source Code Review, Interviewing, Programming, Chatbots, BERT, Sentiment Analysis, Language Models, GPT, Generative Pre-trained Transformers (GPT), Team Management, DVC, OCR, Deep Learning, Artificial Neural Networks (ANN), APIs, Speech Recognition, OpenAI, Semantic Web, Topic Modeling, Clustering, Text Classification, OpenAI GPT-4 API, OpenAI GPT-3 API, Speech to Text, Transfer Learning, FastAPI