LangChain is a framework for developing applications powered by large language models (LLMs). It provides open-source building blocks for development, LangSmith for monitoring and optimizing production chains, and LangServe for turning chains into deployable APIs.
Installation and setup
We start by installing the langchain and premai packages. Run the following command to install them:
pip install premai langchain
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_community.chat_models import ChatPremAI
Set up the PremAI client in LangChain
Once we have imported the required modules, let's set up our client. For this example, we assume that our project_id is 8, but make sure to use your own project ID; otherwise the client will throw an error.
To use LangChain with PremAI, you do not need to pass a model name or set any parameters on the chat client. By default, it uses the model name and parameters configured in the LaunchPad.
If you set the model or other parameters such as temperature or max_tokens when creating the client, they override the default configuration used in the LaunchPad.
import os
import getpass
if "PREMAI_API_KEY" not in os.environ:
    os.environ["PREMAI_API_KEY"] = getpass.getpass("PremAI API Key:")
chat = ChatPremAI(project_id=8)
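For example, if you do want to override the LaunchPad defaults, you can pass the parameters when creating the client. A minimal sketch, where the model name below is just a placeholder; use one available in your project:
# The model name is a placeholder; these values override the LaunchPad defaults
chat = ChatPremAI(project_id=8, model="gpt-4o", temperature=0.7, max_tokens=256)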
Chat Completions
ChatPremAI supports two methods: invoke (which is the same as generate) and stream.
The first returns a static result, whereas the second streams tokens one by one. Here's how you can generate chat-like completions.
human_message = HumanMessage(content="Who are you?")
chat.invoke([human_message])
system_message = SystemMessage(content="You are a friendly assistant.")
human_message = HumanMessage(content="Who are you?")
chat.invoke([system_message, human_message])
chat.invoke(
    [system_message, human_message],
    temperature=0.7, max_tokens=20, top_p=0.95
)
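The invoke call returns a LangChain message object; to read just the text, use its content attribute:
response = chat.invoke([system_message, human_message])
print(response.content)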
Please note: if you place a system prompt here, it will override the system prompt that was fixed while deploying the application from the platform. You can find all the optional parameters here. Any parameters other than the supported ones will be automatically removed before calling the model.
Native RAG Support with Prem Repositories
Prem Repositories allow users to upload documents (.txt, .pdf, etc.) and connect them to LLMs. You can think of Prem repositories as native RAG, where each repository can be considered a vector database. You can connect multiple repositories. You can learn more about repositories here.
Repositories are also supported in the LangChain PremAI integration. Here is how you can use them.
query = "what is the diameter of individual Galaxy"
repository_ids = [1991]
repositories = dict(
    ids=repository_ids,
    similarity_threshold=0.3,
    limit=3
)
Please note: similar to model_name, when you pass the repositories argument you are potentially overriding the repositories connected in the LaunchPad.
Now we connect the repository to our chat object to invoke RAG-based generations.
import json

response = chat.invoke(query, max_tokens=100, repositories=repositories)
print(response.content)
print(json.dumps(response.response_metadata, indent=4))
The diameters of individual galaxies range from 80,000-150,000 light-years.
{
    "document_chunks": [
        {
            "repository_id": 1991,
            "document_id": 1307,
            "chunk_id": 173926,
            "document_name": "Kegy 202 Chapter 2",
            "similarity_score": 0.586126983165741,
            "content": "n thousands\n                                                                                                                                               of           light-years. The diameters of individual\n                                                                                                                                               galaxies range from 80,000-150,000 light\n                                                                                                                       "
        },
        {
            "repository_id": 1991,
            "document_id": 1307,
            "chunk_id": 173925,
            "document_name": "Kegy 202 Chapter 2",
            "similarity_score": 0.4815782308578491,
            "content": "                                                for development of galaxies. A galaxy contains\n                                                                                                                                               a large number of stars. Galaxies spread over\n                                                                                                                                               vast distances that are measured in thousands\n                                       "
        },
        {
            "repository_id": 1991,
            "document_id": 1307,
            "chunk_id": 173916,
            "document_name": "Kegy 202 Chapter 2",
            "similarity_score": 0.38112708926200867,
            "content": " was separated from the               from each other as the balloon expands.\n  solar surface. As the passing star moved away,             Similarly, the distance between the galaxies is\n  the material separated from the solar surface\n  continued to revolve around the sun and it\n  slowly condensed into planets. Sir James Jeans\n  and later Sir Harold Jeffrey supported thisnot to be republishedalso found to be increasing and thereby, the\n                                                             universe is"
        }
    ]
}
Ideally, you do not need to pass repository IDs here to get retrieval-augmented generations. You can still get the same result if you have connected the repositories in the Prem platform.
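For instance, a plain invoke (a minimal sketch, assuming the repositories are already connected in the LaunchPad) still returns a retrieval-augmented answer:
# No repositories argument: falls back to the repositories connected in the LaunchPad
response = chat.invoke(query, max_tokens=100)
print(response.content)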
Streaming
In this section, let's see how we can stream tokens using LangChain and PremAI. Here's how you do it.
import sys
for chunk in chat.stream("hello how are you"):
    sys.stdout.write(chunk.content)
    sys.stdout.flush()
import sys
for chunk in chat.stream(
    "hello how are you",
    system_prompt = "You are an helpful assistant", temperature = 0.7, max_tokens = 20
):
    sys.stdout.write(chunk.content)
    sys.stdout.flush()
Please note: as of now, RAG with streaming is not supported. However, we still support it with our API. You can learn more about that here.
Embedding
In this section we discuss how to access different embedding models using PremAIEmbeddings with LangChain. Let's start by importing our modules and setting our API key.
import os
import getpass
from langchain_community.embeddings import PremAIEmbeddings
if os.environ.get("PREMAI_API_KEY") is None:
    os.environ["PREMAI_API_KEY"] = getpass.getpass("PremAI API Key:")
We will use the text-embedding-3-large model for this example.
model = "text-embedding-3-large"
embedder = PremAIEmbeddings(project_id=8, model=model)
query = "Hello, this is a test query"
query_result = embedder.embed_query(query)
# Let's print the first five elements of the query embedding vector
print(query_result[:5])
Please note: unlike chat, setting the model argument is mandatory for PremAIEmbeddings.
documents = [
    "This is document1",
    "This is document2",
    "This is document3"
]
doc_result = embedder.embed_documents(documents)
# Similar to the previous result, let's print the first five elements
# of the first document's embedding vector
print(doc_result[0][:5])
print(f"Dimension of embeddings: {len(query_result)}")
Result:
[-0.02129288576543331,
0.0008162345038726926,
-0.004556538071483374,
0.02918623760342598,
-0.02547479420900345]
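As a quick sanity check (a minimal, framework-agnostic sketch, not part of the PremAI API), you can rank the documents against the query by the cosine similarity of their embeddings:
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector norms
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Rank each document by its similarity to the query embedding
for doc, vec in zip(documents, doc_result):
    print(f"{doc}: {cosine_similarity(query_result, vec):.4f}")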