*docs: create readme — committed by developer239 on Jul 27, 2024 (commit 07c77a1, parent 557cb72)*
# llama.cpp-ts 🦙

[![npm version](https://img.shields.io/npm/v/llama.cpp-ts.svg?style=flat)](https://www.npmjs.com/package/llama.cpp-ts "View this project on npm")

LlamaCPP-ts is a Node.js binding for the [LlamaCPP](https://github.com/developer239/llama-wrapped-cmake) library, which wraps the [llama.cpp](https://github.com/ggerganov/llama.cpp) framework. It provides an easy-to-use interface for running language models in Node.js applications, supporting both synchronous queries and asynchronous streaming responses.

**Supported Systems:**

- macOS
- Windows (not tested yet)
- Linux (not tested yet)

## Models

You can find compatible GGUF models [here](https://huggingface.co/bullerwins/Meta-Llama-3.1-8B-Instruct-GGUF/tree/main).

## Installation

Ensure that you have [CMake](https://cmake.org) installed on your system:

- On macOS: `brew install cmake`
- On Windows: `choco install cmake`
- On Linux: `sudo apt-get install cmake`
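Before installing the package, you can quickly check that CMake is on your `PATH` (the fallback message here is just illustrative):

```shell
# Print the CMake version if available, otherwise a hint; always exits 0.
command -v cmake >/dev/null 2>&1 && cmake --version | head -n 1 || echo "CMake not found; install it first"
```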

Then, install the package:

```bash
npm install llama.cpp-ts
# or
yarn add llama.cpp-ts
```

## Usage

### Basic Usage

```typescript
import { Llama } from 'llama.cpp-ts';

const llama = new Llama();
const initialized = llama.initialize('./path/to/your/model.gguf');

if (initialized) {
  const response: string = llama.runQuery("Tell me a joke about programming.", 100);
  console.log(response);
} else {
  console.error("Failed to initialize the model.");
}
```

### Streaming Responses

```typescript
import { Llama, TokenStream } from 'llama.cpp-ts';

async function main() {
  const llama = new Llama();
  const initialized: boolean = llama.initialize('./path/to/your/model.gguf');

  if (initialized) {
    const tokenStream: TokenStream = llama.runQueryStream("Explain quantum computing", 200);

    while (true) {
      const token: string | null = await tokenStream.read();
      if (token === null) break;
      process.stdout.write(token);
    }
  } else {
    console.error("Failed to initialize the model.");
  }
}

main().catch(console.error);
```

## API Reference

### Llama Class

The `Llama` class provides methods to interact with language models loaded through llama.cpp.

#### Public Methods

- `constructor()`: Creates a new Llama instance.
- `initialize(modelPath: string, contextSize?: number): boolean`: Initializes the model with the specified path and optional context size. Returns `true` on success, `false` otherwise.
- `runQuery(prompt: string, maxTokens?: number): string`: Runs a query with the given prompt and returns the result as a string.
- `runQueryStream(prompt: string, maxTokens?: number): TokenStream`: Streams the response to the given prompt, returning a `TokenStream` object.
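One way to work with this API is to wrap the boolean `initialize` check in a helper that throws on failure. The sketch below is illustrative, not part of the package: `LlamaLike` merely mirrors the signatures documented above, and `FakeLlama` is a stand-in so the snippet runs without a model file.

```typescript
// `LlamaLike` mirrors the documented method signatures; `FakeLlama` is a
// hypothetical stand-in for demonstration — neither ships with llama.cpp-ts.
interface LlamaLike {
  initialize(modelPath: string, contextSize?: number): boolean;
  runQuery(prompt: string, maxTokens?: number): string;
}

function generate(model: LlamaLike, modelPath: string, prompt: string, maxTokens = 100): string {
  // Surface initialization failures as exceptions instead of a boolean check.
  if (!model.initialize(modelPath)) {
    throw new Error(`Failed to initialize model at ${modelPath}`);
  }
  return model.runQuery(prompt, maxTokens);
}

class FakeLlama implements LlamaLike {
  initialize(modelPath: string): boolean {
    return modelPath.endsWith('.gguf'); // pretend only .gguf paths load
  }
  runQuery(prompt: string, maxTokens = 100): string {
    return `echo(${maxTokens}): ${prompt}`;
  }
}

console.log(generate(new FakeLlama(), './model.gguf', 'Hello'));
// → echo(100): Hello
```

With the real `Llama` class, the same `generate` helper works unchanged, since it only depends on the two documented methods.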

### TokenStream Class

The `TokenStream` class represents a stream of tokens generated by the language model.

#### Public Methods

- `read(): Promise<string | null>`: Reads the next token from the stream. Returns `null` when the stream is finished.
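The read-until-null contract maps naturally onto an async generator, which lets callers use `for await...of` instead of a manual loop. This is a sketch, not part of the package: `TokenStreamLike` mirrors the `read()` signature above, and `StubStream` is a stand-in so the snippet runs without a model.

```typescript
// Adapter turning the read-until-null contract into an async iterable.
// `StubStream` is a hypothetical stand-in for a real TokenStream.
interface TokenStreamLike {
  read(): Promise<string | null>;
}

async function* tokens(stream: TokenStreamLike): AsyncGenerator<string> {
  while (true) {
    const token = await stream.read();
    if (token === null) return; // null marks end of stream
    yield token;
  }
}

class StubStream implements TokenStreamLike {
  private queue = ['Hello', ', ', 'world', '!'];
  async read(): Promise<string | null> {
    return this.queue.shift() ?? null;
  }
}

async function demo(): Promise<string> {
  let out = '';
  for await (const t of tokens(new StubStream())) {
    out += t;
  }
  return out;
}

demo().then(console.log);
// → Hello, world!
```

The same `tokens` adapter would accept the `TokenStream` returned by `runQueryStream`, since it relies only on the documented `read()` method.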
