feat: Update PostgreSQL image to include pgvector extension #426

Merged
gaya3-zipstack merged 7 commits into Zipstack:main on Jul 26, 2024

Conversation

mutaiib
Contributor

@mutaiib mutaiib commented Jun 29, 2024

What

Update PostgreSQL image to include pgvector extension.

Why

The current image under docker/docker-compose-dev-essentials:services:db does not include the pgvector extension, which blocks the use of PostgreSQL as the vector database.

How

Replaced the postgres:15.6 image with pgvector/pgvector:pg15.

Can this PR break any existing features? If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)

No. The new image ships with pgvector installed and runs the same supported major version of PostgreSQL (15).
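
The version claim above can be sanity-checked from the two image tags. The following is only a minimal sketch, not part of this PR. It assumes the official `postgres` image tag starts with the server version (for example `15.6`) and that `pgvector/pgvector` tags carry the server version as `pg<major>` (for example `pg15`). The helper names `postgres_major` and `same_major` are hypothetical.

```python
import re


def postgres_major(image: str) -> int:
    """Extract the PostgreSQL major version from an image reference.

    Assumes the reference has the form "<name>:<tag>" with no registry port.
    """
    _, _, tag = image.rpartition(":")
    # pgvector images mark the server version as "pg<major>", e.g. "pg15".
    match = re.search(r"pg(\d+)", tag)
    if match:
        return int(match.group(1))
    # Official postgres images lead with the version, e.g. "15.6".
    match = re.match(r"(\d+)", tag)
    if match:
        return int(match.group(1))
    raise ValueError(f"cannot determine PostgreSQL major version from {image!r}")


def same_major(old_image: str, new_image: str) -> bool:
    """Return True when both images run the same PostgreSQL major version."""
    return postgres_major(old_image) == postgres_major(new_image)


if __name__ == "__main__":
    print(same_major("postgres:15.6", "pgvector/pgvector:pg15"))
```

Matching major versions matter because a PostgreSQL data directory is only readable by the major version that created it, so an existing volume should only be reused when this check passes.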

Database Migrations

N/A

Env Config

N/A

Relevant Docs

pgvector GitHub

Related Issues or PRs

N/A

Dependencies Versions

N/A

Notes on Testing

N/A

Screenshots

N/A

Checklist

I have read and understood the Contribution Guidelines.

- Updated the PostgreSQL image to a pgvector-enabled image.
- Ensured compatibility with existing configurations.
@CLAassistant

CLAassistant commented Jun 29, 2024

CLA assistant check
All committers have signed the CLA.

@hari-kuriakose hari-kuriakose added the enhancement and good first issue labels on Jul 1, 2024
@hari-kuriakose
Contributor

hari-kuriakose commented Jul 1, 2024

@mutaiib Thanks for the contribution!

This could definitely help when the user wants to use the same PostgreSQL instance both for Unstract backend metadata storage and as the Vector DB.

However, I wanted to confirm one thing.
After changing the default Unstract DB to the pgvector-provided PostgreSQL instance:

  • Did the app run successfully?
  • Were you able to add the same instance as a Vector DB too?

NOTE: Later we could use the new pgvectorscale extension, which is said to be more efficient and performant. It should become possible once LlamaIndex supports it.

@mutaibsha

mutaibsha commented Jul 1, 2024

> @mutaiib Thanks for the contribution!
>
> This could definitely help when the user wants to use the same PostgreSQL instance both for Unstract backend metadata storage and as the Vector DB.
>
> However, I wanted to confirm one thing. After changing the default Unstract DB to the pgvector-provided PostgreSQL instance:
>
>   • Did the app run successfully?
>   • Were you able to add the same instance as a Vector DB too?
>
> NOTE: Later we could use the new pgvectorscale extension, which is said to be more efficient and performant. It should become possible once LlamaIndex supports it.

Yup, both. Not only was I able to run the application, I also tested it with PostgreSQL as the vector DB integration.

BTW, I'll have a look into pgvectorscale.
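
For anyone who wants to repeat the second check, here is a rough sketch of a pgvector smoke test. It is not the procedure used in this thread, and the function name `check_pgvector` is hypothetical. It takes any DB-API 2.0 connection (for example one from psycopg2) and relies only on SQL that pgvector documents: the extension name `vector`, the `vector` type, and the `<->` L2 distance operator.

```python
def check_pgvector(conn):
    """Enable pgvector on the current database and return its version string.

    `conn` is any DB-API 2.0 connection to the PostgreSQL instance.
    """
    cur = conn.cursor()
    try:
        # pgvector registers itself under the extension name "vector".
        cur.execute("CREATE EXTENSION IF NOT EXISTS vector")

        cur.execute("SELECT extversion FROM pg_extension WHERE extname = 'vector'")
        row = cur.fetchone()
        if row is None:
            raise RuntimeError("pgvector is not enabled on this database")

        # Exercise the vector type: two vectors that differ by 1 in a
        # single coordinate should have an L2 distance of 1.
        cur.execute("SELECT '[1,2,3]'::vector <-> '[1,2,4]'::vector")
        distance = cur.fetchone()[0]
        if abs(distance - 1.0) > 1e-6:
            raise RuntimeError(f"unexpected L2 distance: {distance}")

        conn.commit()
        return row[0]
    finally:
        cur.close()


# Hypothetical usage with psycopg2; every connection value is a placeholder:
# import psycopg2
# conn = psycopg2.connect(host="<host>", port=5432, dbname="<db>",
#                         user="<user>", password="<password>")
# print(check_pgvector(conn))
```

On the plain postgres:15.6 image the `CREATE EXTENSION` statement is expected to fail because the extension files are not present, which is the gap this PR addresses.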

@hari-kuriakose
Contributor

@ritwik-g @gaya3-zipstack @chandrasekharan-zipstack

  • For OSS it will be pgvector by default, and for Enterprise it will be postgres by default. Is this difference handled in respective envs?
  • For existing OSS installations, will switching from current postgres to new pgvector automatically import existing data under postgres_data Docker volume without any compatibility issues?

@chandrasekharan-zipstack
Contributor

@hari-kuriakose

> • For OSS it will be pgvector by default, and for Enterprise it will be postgres by default. Is this difference handled in respective envs?

Yes, this difference is handled and controlled by the image we use for different deployments.

> • For existing OSS installations, will switching from current postgres to new pgvector automatically import existing data under postgres_data Docker volume without any compatibility issues?

I tried this and didn't face any issues; however, I will let @gaya3-zipstack confirm, since she explicitly did some testing around this.

@ritwik-g
Contributor

> @ritwik-g @gaya3-zipstack @chandrasekharan-zipstack
>
>   • For OSS it will be pgvector by default, and for Enterprise it will be postgres by default. Is this difference handled in respective envs?
>   • For existing OSS installations, will switching from current postgres to new pgvector automatically import existing data under postgres_data Docker volume without any compatibility issues?

Regarding the second point: since we are storing the data in a volume, the image change shouldn't cause any issues, assuming the pgvector image also uses the same path for data storage.

@gaya3-zipstack
Contributor

> @hari-kuriakose
>
> > • For OSS it will be pgvector by default, and for Enterprise it will be postgres by default. Is this difference handled in respective envs?
>
> Yes, this difference is handled and controlled by the image we use for different deployments.
>
> > • For existing OSS installations, will switching from current postgres to new pgvector automatically import existing data under postgres_data Docker volume without any compatibility issues?
>
> I tried this and didn't face any issues; however, I will let @gaya3-zipstack confirm, since she explicitly did some testing around this.

I also tested locally with existing data and did not see any issues.
I only saw a compatibility problem when a different version of pgvector was used.

Contributor

@gaya3-zipstack gaya3-zipstack left a comment

Looks fine

@gaya3-zipstack gaya3-zipstack merged commit 8c023f6 into Zipstack:main Jul 26, 2024
3 of 4 checks passed
Labels
enhancement, good first issue
7 participants