The more data, the better the AI, isn’t it? | Michael Kieweg
Explore the nuanced relationship between data and AI performance, highlighting the importance of data quality, context, and human oversight in achieving accurate information extraction and classification.
- The idea that “more data, the better the AI” is oversimplified, as data quality and context are crucial factors in AI performance.
- Leverton, the speaker’s company, focuses on real estate contracts and legal documents, which require complex information extraction and classification.
- Optical character recognition (OCR) converts scanned images into searchable text, but it is error-prone, especially on documents with complex layouts or handwritten text.
- The AI model’s output depends on its training data, which must be carefully curated and annotated to ensure accuracy.
- Human reviewers are necessary to correct and validate the machine’s output, which can be time-consuming and labor-intensive.
- A two-step review process can improve accuracy and reduce errors.
- Post-processing the OCR output can significantly improve the quality of the extracted text.
- Leverton’s software uses deep learning technologies to automatically extract information from documents, but human expertise is still required for data cleansing and annotation.
- The company has a large team of technical consultants who work closely with customers to set up and refine the data model and AI.
- The AI system draws on information from different documents to identify patterns and relationships, but extracting the relevant information from unstructured text remains challenging.
- Leverton’s software is used by over 100 customers who require high-quality data and reliable information extraction.
- The company prioritizes data security and transparency, ensuring that customers have control over their data.
- Clear names and descriptions in the data model matter, as they can significantly affect the accuracy of the extracted information.
- The AI system combines machine learning and human expertise to extract and classify data points; complex documents with multiple sections and conflicting information remain challenging.
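
The points about OCR post-processing and the two-step review can be illustrated with a minimal Python sketch. It is not Leverton's implementation. The names `Extraction`, `clean_numeric`, `route`, and `two_step_review`, the character-substitution table, and the 0.9 confidence threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold: fields below it are sent to a human reviewer.
REVIEW_THRESHOLD = 0.9

# Assumed table of common OCR confusions in numeric fields (letter read for digit).
OCR_DIGIT_FIXES = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", "S": "5"})


@dataclass
class Extraction:
    field: str         # name of the data point, e.g. "rent"
    raw_text: str      # text as returned by OCR and extraction
    confidence: float  # model confidence between 0 and 1


def clean_numeric(raw_text: str) -> str:
    """Post-process an OCR'd numeric value: undo letter/digit confusions, drop spaces."""
    return raw_text.translate(OCR_DIGIT_FIXES).replace(" ", "")


def route(items: list[Extraction]) -> tuple[list[Extraction], list[Extraction]]:
    """Split extractions into auto-accepted ones and ones queued for human review."""
    accepted, review_queue = [], []
    for item in items:
        if item.confidence < REVIEW_THRESHOLD:
            review_queue.append(item)
        else:
            accepted.append(item)
    return accepted, review_queue


def two_step_review(first_value: str, second_value: str) -> Optional[str]:
    """Accept a value only if two independent reviewers agree; None means escalate."""
    return first_value if first_value == second_value else None
```

A short usage example under the same assumptions:

```python
items = [Extraction("rent", clean_numeric("1O 5OO"), 0.97),
         Extraction("term", "l2", 0.55)]
accepted, review_queue = route(items)
print([e.field for e in accepted], [e.field for e in review_queue])
```

A fixed threshold is the simplest routing rule. In practice the cut-off would likely be tuned per field, since an error in a rent amount costs more than an error in a free-text note.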