Matt Cornillon: How I found my Pokémon cards thanks to Postgres: an AI journey (PGConf.EU 2023)

Explore the journey of using Postgres and a machine learning model to detect Pokémon cards from images, with a demo of the real-life application and a discussion of vector storage and manipulation.

Key takeaways
  • The speaker used a machine learning model to detect Pokémon cards from images and stored the resulting embeddings in Postgres using the pgvector extension.
  • The embeddings were generated using a convolutional neural network (CNN) and cosine distance was used for similarity search.
  • The number of dimensions in a generated embedding is determined by the machine learning model; in this case the model produces 768 dimensions.
  • The pgvector extension allows for storing and manipulating vectors, including inserting, deleting, and updating data.
  • The speaker also mentioned that the machine learning model can be improved continuously by reusing the pictures and feeding the model new data.
  • The speaker used Hugging Face, a company offering open-source machine learning models and datasets, to generate the embeddings.
  • The speaker used Label Studio, a tool for turning pictures into a dataset in a format suitable for training, to create the dataset for machine learning.
  • The speaker also mentioned that the pgvector extension offers three distance functions: L2 (Euclidean) distance, inner product, and cosine distance. IVFFlat, which was also mentioned, is an index type rather than a distance function.
  • The speaker showed a demo of the application and mentioned that it’s a real-life use case and not just a theoretical example.
  • The speaker mentioned that similarity search with pgvector is an exact nearest neighbor search when no index is used, and an approximate nearest neighbor search when an index is used.
  • The speaker also mentioned that storing raw image pixels inside Postgres might not be efficient and that embeddings are a better way to store the data.
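
The exact nearest neighbor search mentioned above can be sketched in plain Python. This is only an illustration of the idea, not the speaker's code and not pgvector itself: the function names, the card names, and the tiny 3-dimensional embeddings are all hypothetical (the model in the talk produces 768 dimensions). In pgvector, the equivalent query orders rows by the cosine distance operator `<=>` and applies a LIMIT.

```python
import math


def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity (0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def nearest_cards(query, cards, k=1):
    """Exact search: score every stored embedding, then keep the k closest."""
    scored = [(cosine_distance(query, emb), name) for name, emb in cards.items()]
    scored.sort()  # smallest distance first
    return [name for _, name in scored[:k]]


# Hypothetical toy embeddings, standing in for model output.
cards = {
    "pikachu": [0.9, 0.1, 0.0],
    "charizard": [0.1, 0.9, 0.2],
    "bulbasaur": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a newly photographed card.
query = [0.8, 0.2, 0.1]
best = nearest_cards(query, cards, k=2)
print(best)
```

Because every stored vector is compared against the query, the result is exact but the cost grows linearly with the number of cards. An index such as IVFFlat avoids scanning everything, which is why the result then becomes approximate.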