Simple Dashboard Launcher UI (To launch old dashboards) #1147
Co-authored-by: joshreini1 <[email protected]>
* bump versions in quickstarts * bump version * remove openai references in function definitions page * gemini example * headers * second example: semantic evals * updates, add rag triad * update top header
…truera#696) * add aliases for selectors for main method args and main method return * break down * refine --------- Co-authored-by: Josh Reini <[email protected]>
* exposed AzureOpenAI provider * added docs * Update CONTRIBUTING.md * typo in mkdocs.yml --------- Co-authored-by: Josh Reini <[email protected]>
* first * typos * typehint
* import llama only if needed * use optional imports instead --------- Co-authored-by: Piotr Mardziel <[email protected]>
* fix * typo * don't print external if internal is available
Co-authored-by: Piotr Mardziel <[email protected]>
* adjust docstring for select_context * langchain select_context, update quickstarts * undo app name change * remove dev cell * generalized langchain select_context (truera#711) * generalized langchain select_context * typo * typo in string * update langchain example to pass app in select_context --------- Co-authored-by: Josh Reini <[email protected]> * comments, clarity updates to quickstarts * add lib-independent select_context * update lc li quickstarts --------- Co-authored-by: Piotr Mardziel <[email protected]>
* add optional * bug class_info fix
* update configs * bugfix * dont add class info to dicts
* Fix correctness prompt Fixes truera#718 * Update base.py
* Bump suggested notebook versions * Combine notebooks and py files --------- Co-authored-by: Shayak Sen <[email protected]>
* Bump suggested notebook versions * Combine notebooks and py files * Update __init__.py --------- Co-authored-by: Shayak Sen <[email protected]>
* debug * display python version * python version * PromptTemplate update import * bad escape fix * add msg to exception * pass kwargs in Groundedness * pass kwargs with GroundTruthAgreement * give default value to ground_truth_imp * migrate db on reset
* fix example notebook * fixes * remove commented out
* always use prompt instead of messages * use messages in base * use prompt in bedrock * move score to top of cot template, request entire template be used * remove dev * add TODO
* update langchain instrumentation page * include instrumented methods * llama-index instrumentation updates * update the overview * change path to instrumentation overview * add some more info in appendices and line space --------- Co-authored-by: Piotr Mardziel <[email protected]>
* add instructions and text wrapping * format * clean up github scripts and update README sources * typo --------- Co-authored-by: Josh Reini <[email protected]>
Co-authored-by: joshreini1 <[email protected]>
* fix * remove redundant --------- Co-authored-by: Josh Reini <[email protected]>
* adjusted * fix typo
Co-authored-by: joshreini1 <[email protected]>
* add instructions and text wrapping * format * debugging * making obj arg no longer required * remove obj and add documentation for WithClassInfo * remove IPython from most notebooks and organize imports * fix test errors * forgot warning --------- Co-authored-by: Josh Reini <[email protected]>
* update notebooks to test * rehack * update langchain requirement * add core lowerbound
* fix rag triad and awaitable calls * remove locals printout in awaitables message * update __getattr__ in select_context (truera#1119) --------- Co-authored-by: Piotr Mardziel <[email protected]>
Co-authored-by: Josh Reini <[email protected]>
* Update feedback.py * use name
retreivers -> retrievers
* unify groundedness start * remove groundedness.py * groundedness nli moves * remove custom aggregator * groundedness aggregator to user code * move agg to trulens side by default (groundedness) * remove extra code * remove hf key setting * remove hf import * add comment about aggregation for context relevance * update init * remove unneeded import * use generate_score_and_reasons for groundedness internally * f-strings for groundedness prompts * docstring * docstrings formatting * groundedness reasons template * remove redundant prompt * update quickstarts * llama-index notebooks * rag triad helper update * oai assistant nb * update readme * models notebooks updates * iterate nbs * mongo, pinecone nbs * update huggingface docstring * remove outdated docstring selector notes * more docstring cleaning
* open ai streaming adjustments in cost tracking * notes * delete outputs
Co-authored-by: joshreini1 <[email protected]> Co-authored-by: Josh Reini <[email protected]>
* Update selecting_components.md * Update MultiQueryRetrievalLangchain.ipynb * Update random_evaluation.ipynb * Update canopy_quickstart.ipynb
Co-authored-by: joshreini1 <[email protected]>
* update comprehensiveness + nb * nb expansion * fix typo * meetingbank transcript data * oss models in app * test * benchmarking gpt-3.5-turbo, gpt-4-turbo, and gpt-4o * update path * comprehensiveness benchmark * updated summarization_eval nb * fix normalization * show improvement in comprehensiveness feedback functions --------- Co-authored-by: Daniel <[email protected]>
* version bump * simpler lc quickstart * update installs and imports * update langchain instrumentation docs * remove groundedness ref from providers.md * build docs fixes * remove key cell * fix docs build * firx formatting for stock.md * remove extra spaces * undo format change * update docstrings for hugs and base provider * openai docstring updates * hugs docstring update * update context relevance hugs docstring * more docstring updates * remove can be changed messages from openai provider docstrings
Co-authored-by: joshreini1 <[email protected]>
* add to glossary * finish some terms
We have been using this tool internally because we need to open multiple dashboards side by side and compare them. I thought this might be a common need for all TruLens users, so I'm sharing it here. I know the UI might not be very attractive, but the need is real.
@Nanthagopal-Eswaran would love to understand the need more here. Why do you need to compare multiple dashboards rather than logging the different apps to the same sqlite db and thus comparing the apps in the same dashboard?
Hi @joshreini1, there are two main reasons:
* Evaluating the changes in different versions of the app
* Sharing the report / re-opening the report
Thanks @Nanthagopal-Eswaran - I hope you don't mind if I drill down a bit more :)

> Evaluating the changes in different versions of app
> I get that we can use the same db and add. But this is an ideal case right. What if we want to compare the tests executed by different engineers / teams or if we want to automate these tests through github actions or azdo pipelines and share the db alone through mail.

Comparing tests executed by different engineers/teams/automation would be better supported by using a shared database (or databases) to store the results than by tracking a bunch of different sqlite dbs. Adding a shared database for TruLens to log to only requires passing a database URL compatible with SQLAlchemy (docs). Does this seem reasonable?

> Sharing the report / Re-open the report

This use case seems reasonable; however, I'm not sure it makes sense to support it directly in the package. It might be better as an internal tool. I would suggest that hosting the dashboard might be easier here (for you and the stakeholder), via an EC2 instance or similar.
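For readers following the shared-database suggestion in the comment above, here is a minimal sketch of what a SQLAlchemy-compatible database URL looks like. The helper names (`sqlite_url`, `server_url`) and the example host and credentials are hypothetical and exist only for illustration. How the URL is handed to TruLens is not shown in this thread; the `database_url` argument mentioned in the code comment is an assumption to check against the TruLens docs.

```python
from pathlib import Path
from urllib.parse import quote_plus


def sqlite_url(db_path):
    """Return a SQLAlchemy-style URL for a local SQLite file (hypothetical helper)."""
    # The SQLite URL form is "sqlite:///" followed by the file path, so an
    # absolute POSIX path ends up with four slashes in total.
    return "sqlite:///" + Path(db_path).absolute().as_posix()


def server_url(dialect, user, password, host, port, database):
    """Return a SQLAlchemy-style URL for a shared database server (hypothetical helper)."""
    # Special characters in the password must be URL-encoded.
    return f"{dialect}://{user}:{quote_plus(password)}@{host}:{port}/{database}"


# Example values are made up. The resulting string is what would be passed to
# TruLens, presumably as Tru(database_url=...) (assumption, not confirmed here).
shared = server_url("postgresql", "trulens", "p@ss", "db.example.com", 5432, "evals")
local = sqlite_url("/data/run1.sqlite")
```

With a shared URL like this, every engineer or pipeline logs to the same store, which is the alternative proposed to mailing individual sqlite files around.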
@joshreini1, thanks for your insights.

On the first point, I see what you mean. We will have to try it to see how practical it is, though. I am more worried about the data getting lost or corrupted by mistake, since all the previous reports would now live in a single location and regenerating them would be very costly. FYR, one test run costs more than $40 in our case.

On the second point, yes, it would be good to have this as a separate tool. Feel free to skip this PR. But please have an internal discussion and add this as a separate repo if it is really needed.

This conversation actually made me realize the main problem here. I quickly went through the streamlit repo and found this issue, which clearly shows the importance of standalone HTML reports: streamlit/streamlit#611
Thanks @Nanthagopal-Eswaran - definitely understand and agree with your points on the importance of standalone reports. I'll continue to discuss with the team and get back to you once we've got a plan here. BTW - one additional workaround might be to use
Items to add to release announcement:
It provides a simple UI where the user can select sqlite files from previous runs and launch the TruLens dashboard. This way, users don't have to worry about logging the results for later and can just use the TruLens-provided Streamlit UI at any time.
🕹️ Usage
Run the following command:
Features
Simple UI to quickly select different sqlite files and launch trulens dashboards.
Multiple dashboards can be viewed by giving different port numbers.
Open Multiple Dashboards
Currently, the tool can open only one dashboard at a time. To open multiple dashboards, a quick workaround is to launch this tool multiple times, giving each dashboard a different port number.
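As an illustration of the port-per-dashboard idea, the sketch below pairs each sqlite file in a folder with its own free local port. It is not the launcher's actual code: `free_ports`, `plan_dashboards`, and the `*.sqlite` pattern are hypothetical names chosen for this example. Starting the dashboards themselves is left out; in trulens_eval that would presumably go through something like `Tru(database_file=...).run_dashboard(port=...)`, which is an assumption rather than something shown in this PR.

```python
import socket
from pathlib import Path


def free_ports(count):
    """Ask the OS for `count` distinct free TCP ports on localhost."""
    socks = []
    try:
        for _ in range(count):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            # Binding to port 0 lets the OS pick an unused port.
            s.bind(("127.0.0.1", 0))
            socks.append(s)
        # All sockets are still open here, so the ports cannot repeat.
        return [s.getsockname()[1] for s in socks]
    finally:
        for s in socks:
            s.close()


def plan_dashboards(folder, pattern="*.sqlite"):
    """Pair every matching database file in `folder` with its own port."""
    db_files = sorted(Path(folder).glob(pattern))
    return list(zip(db_files, free_ports(len(db_files))))


if __name__ == "__main__":
    for db_file, port in plan_dashboards("."):
        # A real launcher would start one dashboard per entry here.
        print(f"{db_file} -> http://localhost:{port}")
```

The ports are released before the function returns, so another process could in principle claim one before a dashboard starts; for a local comparison tool that small race is usually acceptable.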
Other details that are good to know but need not be announced:
🌱 Improvements [For Future]