*docs: create readme — committed by developer239 on Jul 27, 2024 (commit 07c77a1, parent 557cb72)*
# llama.cpp-ts 🦙

[![npm version](https://img.shields.io/npm/v/llama.cpp-ts.svg?style=flat)](https://www.npmjs.com/package/llama.cpp-ts "View this project on npm")

LlamaCPP-ts is a Node.js binding for the [LlamaCPP](https://github.com/developer239/llama-wrapped-cmake) library, which wraps the [llama.cpp](https://github.com/ggerganov/llama.cpp) framework. It provides an easy-to-use interface for running language models in Node.js applications, supporting both synchronous queries and asynchronous streaming responses.

**Supported Systems:**

- macOS
- Windows (not tested yet)
- Linux (not tested yet)

## Models

You can find compatible GGUF models [here](https://huggingface.co/bullerwins/Meta-Llama-3.1-8B-Instruct-GGUF/tree/main).

## Installation

Ensure that you have [CMake](https://cmake.org) installed on your system:

- On macOS: `brew install cmake`
- On Windows: `choco install cmake`
- On Linux: `sudo apt-get install cmake`
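Before installing the package, you can quickly check that CMake is on your `PATH` (the fallback message here is just illustrative):

```shell
# Print the CMake version if available, otherwise a hint; always exits 0.
command -v cmake >/dev/null 2>&1 && cmake --version | head -n 1 || echo "CMake not found; install it first"
```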

Then, install the package:

```bash
npm install llama.cpp-ts
# or
yarn add llama.cpp-ts
```

## Usage

### Basic Usage

```typescript
import { Llama } from 'llama.cpp-ts';

const llama = new Llama();
const initialized = llama.initialize('./path/to/your/model.gguf');

if (initialized) {
  const response: string = llama.runQuery("Tell me a joke about programming.", 100);
  console.log(response);
} else {
  console.error("Failed to initialize the model.");
}
```

### Streaming Responses

```typescript
import { Llama, TokenStream } from 'llama.cpp-ts';

async function main() {
  const llama = new Llama();
  const initialized: boolean = llama.initialize('./path/to/your/model.gguf');

  if (initialized) {
    const tokenStream: TokenStream = llama.runQueryStream("Explain quantum computing", 200);

    while (true) {
      const token: string | null = await tokenStream.read();
      if (token === null) break;
      process.stdout.write(token);
    }
  } else {
    console.error("Failed to initialize the model.");
  }
}

main().catch(console.error);
```

## API Reference

### Llama Class

The `Llama` class provides methods to interact with language models loaded through llama.cpp.

#### Public Methods

- `constructor()`: Creates a new Llama instance.
- `initialize(modelPath: string, contextSize?: number): boolean`: Initializes the model with the specified path and optional context size. Returns `true` on success, `false` otherwise.
- `runQuery(prompt: string, maxTokens?: number): string`: Runs a query with the given prompt and returns the result as a string.
- `runQueryStream(prompt: string, maxTokens?: number): TokenStream`: Streams the response to the given prompt, returning a `TokenStream` object.
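One way to work with this API is to wrap the boolean `initialize` check in a helper that throws on failure. The sketch below is illustrative, not part of the package: `LlamaLike` merely mirrors the signatures documented above, and `FakeLlama` is a stand-in so the snippet runs without a model file.

```typescript
// `LlamaLike` mirrors the documented method signatures; `FakeLlama` is a
// hypothetical stand-in for demonstration — neither ships with llama.cpp-ts.
interface LlamaLike {
  initialize(modelPath: string, contextSize?: number): boolean;
  runQuery(prompt: string, maxTokens?: number): string;
}

function generate(model: LlamaLike, modelPath: string, prompt: string, maxTokens = 100): string {
  // Surface initialization failures as exceptions instead of a boolean check.
  if (!model.initialize(modelPath)) {
    throw new Error(`Failed to initialize model at ${modelPath}`);
  }
  return model.runQuery(prompt, maxTokens);
}

class FakeLlama implements LlamaLike {
  initialize(modelPath: string): boolean {
    return modelPath.endsWith('.gguf'); // pretend only .gguf paths load
  }
  runQuery(prompt: string, maxTokens = 100): string {
    return `echo(${maxTokens}): ${prompt}`;
  }
}

console.log(generate(new FakeLlama(), './model.gguf', 'Hello'));
// → echo(100): Hello
```

With the real `Llama` class, the same `generate` helper works unchanged, since it only depends on the two documented methods.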

### TokenStream Class

The `TokenStream` class represents a stream of tokens generated by the language model.

#### Public Methods

- `read(): Promise<string | null>`: Reads the next token from the stream. Returns `null` when the stream is finished.
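The read-until-null contract maps naturally onto an async generator, which lets callers use `for await...of` instead of a manual loop. This is a sketch, not part of the package: `TokenStreamLike` mirrors the `read()` signature above, and `StubStream` is a stand-in so the snippet runs without a model.

```typescript
// Adapter turning the read-until-null contract into an async iterable.
// `StubStream` is a hypothetical stand-in for a real TokenStream.
interface TokenStreamLike {
  read(): Promise<string | null>;
}

async function* tokens(stream: TokenStreamLike): AsyncGenerator<string> {
  while (true) {
    const token = await stream.read();
    if (token === null) return; // null marks end of stream
    yield token;
  }
}

class StubStream implements TokenStreamLike {
  private queue = ['Hello', ', ', 'world', '!'];
  async read(): Promise<string | null> {
    return this.queue.shift() ?? null;
  }
}

async function demo(): Promise<string> {
  let out = '';
  for await (const t of tokens(new StubStream())) {
    out += t;
  }
  return out;
}

demo().then(console.log);
// → Hello, world!
```

The same `tokens` adapter would accept the `TokenStream` returned by `runQueryStream`, since it relies only on the documented `read()` method.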
