[DO NOT MERGE] Credential Engine launch #2415

Open
wants to merge 1,023 commits into
base: main

Conversation

ChelseaKR
Collaborator

No description provided.

@ChelseaKR ChelseaKR force-pushed the credential-engine branch from 196d379 to c609043 Compare May 10, 2024 22:25
@ChelseaKR ChelseaKR force-pushed the credential-engine branch from da6bc08 to efa049a Compare July 5, 2024 16:13
scwambach and others added 30 commits January 16, 2025 10:09
…ntains matches for learning opportunity profile ceterms:name and ceterms:description
…ssing data

- Simplified access to nested fields using optional chaining (?.).
- Added nullish coalescing (??) to provide default values for missing data.
- Ensured better handling of edge cases, such as missing or incomplete contact information.
- Improved code clarity and reduced redundancy.
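
For reviewers, the pattern described in the list above can be sketched roughly as follows. This is only an illustration: `ProviderRecord`, `ContactSummary`, `summarizeContact`, and the default strings are hypothetical names and values, not code from this branch.

```typescript
// Hypothetical shape of a partially populated provider record.
interface ProviderRecord {
  name?: string;
  contact?: {
    email?: string | null;
    phone?: string | null;
  } | null;
}

interface ContactSummary {
  name: string;
  email: string;
  phone: string;
}

// Optional chaining (?.) stops at the first null/undefined link,
// and nullish coalescing (??) supplies a default only for null/undefined.
function summarizeContact(
  record: ProviderRecord | null | undefined
): ContactSummary {
  return {
    name: record?.name ?? "Unknown provider", // assumed default label
    email: record?.contact?.email ?? "",
    phone: record?.contact?.phone ?? "",
  };
}
```

Unlike `||`, the `??` operator keeps legitimate falsy values such as an empty string or `0`, which matters when a provider deliberately leaves a field blank.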
…ory`

- Added `normalizeQueryParams` to standardize query parameters for consistent cache keys.
- Updated caching logic to dynamically use normalized parameters and reduce cache misses.
- Enhanced logging to distinguish cache hits and misses for better observability.
- Renamed functions for clarity: `filterCerts` -> `filterRecords`, `paginateCerts` -> `paginateRecords`.
- Preserved existing functionality for filtering, sorting, and pagination.

This refactor improves caching efficiency, reduces redundant API calls, and enhances code readability.
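
A minimal sketch of the idea behind normalized cache keys, for anyone skimming the thread. The function name `normalizeQueryParams` comes from the commit message, but the body below is an assumption about how such normalization might work (sorted keys, trimmed and lowercased values, empty values dropped); `QueryParams`, `cachedSearch`, and the in-memory `Map` are hypothetical.

```typescript
// Assumed parameter shape; the real search parameters may differ.
type QueryParams = Record<string, string | number | boolean | null | undefined>;

// Build a stable key: same logical query => same string,
// regardless of key order, casing, or surrounding whitespace.
function normalizeQueryParams(params: QueryParams): string {
  const entries: [string, string][] = [];
  for (const key of Object.keys(params).sort()) {
    const value = params[key];
    if (value === undefined || value === null) continue;
    const text = String(value).trim().toLowerCase(); // assumes case-insensitive search
    if (text === "") continue;
    entries.push([key, text]);
  }
  return entries.map(([k, v]) => `${k}=${encodeURIComponent(v)}`).join("&");
}

// Hypothetical in-memory cache keyed by the normalized string.
const cache = new Map<string, string[]>();

function cachedSearch(
  params: QueryParams,
  fetchFn: () => string[]
): { hit: boolean; data: string[] } {
  const key = normalizeQueryParams(params);
  const cached = cache.get(key);
  if (cached !== undefined) return { hit: true, data: cached };
  const data = fetchFn(); // only called on a cache miss
  cache.set(key, data);
  return { hit: false, data };
}
```

Returning the `hit` flag is one way to let the caller log hits and misses separately.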
- Renamed `transformCertificateToTraining` to `transformLearningOpportunityCTDLToTrainingResult` for clarity.
- Updated variable naming for consistency (`allCerts` -> `learningOpportunities`).
- Added `tokenize` and `levenshteinDistance` functions to support text processing.
- Introduced `rankResults` to rank search results based on query relevance.
- Integrated ranking into `searchTrainingsFactory` to improve search accuracy.
- Included `description` in the `TrainingResult` transformation for better matching.

This update enhances search result quality by prioritizing relevant results.
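
To make the text-processing helpers concrete, here is a rough sketch of what `tokenize` and `levenshteinDistance` could look like. Only the two function names come from the commit message; the bodies are a standard textbook approach and an assumption, not the code in this branch. `isFuzzyMatch` and its `maxDistance` parameter are hypothetical.

```typescript
// Lowercase the text and split on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

// Classic dynamic-programming edit distance, keeping only two rows.
function levenshteinDistance(a: string, b: string): number {
  // prev[j] = distance between the first i-1 chars of a and the first j chars of b.
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(
        Math.min(
          prev[j] + 1, // deletion
          curr[j - 1] + 1, // insertion
          prev[j - 1] + cost // substitution (or match)
        )
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: treat two tokens as equivalent if they are a few edits apart.
function isFuzzyMatch(a: string, b: string, maxDistance = 1): boolean {
  return levenshteinDistance(a, b) <= maxDistance;
}
```

A ranking step could then compare each query token against the tokens of a result's name and description, counting exact matches more heavily than fuzzy ones.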
- Added provider name (`ceterms:ownedBy`) to ranking to improve search results.
- Introduced `STOP_WORDS` set to prevent stripping essential provider words.
- Increased proper noun weighting in `rankResults` for better relevance.
- Adjusted tokenization to retain hyphens, numbers, and spaces.
- Boosted exact name matches significantly to prioritize precise results.
- Added fuzzy matching logic for minor spelling variations.
- Improved phrase matching by including description and provider name.

These updates ensure better recall and precision when searching for training opportunities.
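
One possible reading of the tokenization change, sketched under assumptions: hyphens and numbers stay inside tokens, and only words listed in `STOP_WORDS` are dropped, so provider words outside that list survive. The contents of the set and the name `tokenizeKeepingHyphens` are hypothetical; the actual list in this branch is not shown in the thread.

```typescript
// Assumed stop-word list, for illustration only.
const STOP_WORDS: ReadonlySet<string> = new Set([
  "a", "an", "and", "for", "in", "of", "the", "to",
]);

function tokenizeKeepingHyphens(text: string): string[] {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, " ") // keep letters, digits, whitespace, hyphens
    .split(/\s+/)
    .map((token) => token.replace(/^-+|-+$/g, "")) // drop dangling hyphens
    .filter((token) => token.length > 0 && !STOP_WORDS.has(token));
}
```

With this approach a standalone separator such as the " - " in a provider name disappears, while a compound like "pre-apprenticeship" remains a single token.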
- Boost exact provider name matches (+15,000) to prioritize direct matches
- Boost exact training program name matches (+2,000) for more relevant results
- Add location-based boosting (+1,500 per match) to prioritize local results
- Increase weight for multi-word phrase matches (+500) to improve precision
- Penalize unrelated results (-2,000) if neither provider nor training name matches

These changes ensure that searches like "Workforce Advantage - Elizabeth nursing" prioritize relevant programs from the correct provider and location.
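
The boosts listed above can be illustrated with a small additive scorer. The weights are the ones quoted in the commit message; everything else (the `TrainingCandidate` shape, `scoreCandidate`, and how "exact", "location", and "phrase" matches are detected) is an assumption made for this sketch, not the logic in `rankResults`.

```typescript
// Hypothetical candidate shape for illustration.
interface TrainingCandidate {
  name: string;
  providerName: string;
  city: string;
}

// Weights as quoted in the commit message.
const WEIGHTS = {
  exactProvider: 15000,
  exactName: 2000,
  location: 1500,
  phrase: 500,
  unrelatedPenalty: -2000,
};

function words(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 0);
}

// Whole-word phrase containment on space-joined word lists.
function containsPhrase(haystack: string[], needle: string[]): boolean {
  if (needle.length === 0) return false;
  return ` ${haystack.join(" ")} `.includes(` ${needle.join(" ")} `);
}

function scoreCandidate(query: string, candidate: TrainingCandidate): number {
  const queryWords = words(query);
  const nameWords = words(candidate.name);
  const providerMatch = containsPhrase(queryWords, words(candidate.providerName));
  const nameMatch = containsPhrase(queryWords, nameWords);

  let score = 0;
  if (providerMatch) score += WEIGHTS.exactProvider;
  if (nameMatch) score += WEIGHTS.exactName;

  // Location boost: once per query word that is also a city word.
  const cityWords = new Set(words(candidate.city));
  for (const w of queryWords) {
    if (cityWords.has(w)) score += WEIGHTS.location;
  }

  // Phrase boost: each adjacent pair of query words found in the training name.
  for (let i = 0; i + 1 < queryWords.length; i++) {
    if (containsPhrase(nameWords, [queryWords[i], queryWords[i + 1]])) {
      score += WEIGHTS.phrase;
    }
  }

  if (!providerMatch && !nameMatch) score += WEIGHTS.unrelatedPenalty;
  return score;
}
```

Under these assumptions, the query "Workforce Advantage - Elizabeth nursing" gives a program from that provider in Elizabeth the provider boost plus one location boost, while a program from an unrelated provider in another city only receives the penalty.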
…cified text should be hidden if provider does not give that info"
- Refactored `rankResults` function:
  - Introduced `COMMON_WORDS` set to filter out generic terms.
  - Improved token-based matching with weighted scores.
  - Enhanced fuzzy matching efficiency.
  - Set a minimum score threshold for relevance filtering.

- Improved query filtering:
  - Standardized tokenization for training names, descriptions, and providers.
  - Boosted ranking for strong matches and removed low-relevance results.

- Added additional logging for debugging and performance insights.

This refactor enhances search accuracy, reduces noise, and improves result ranking consistency.
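
As a rough illustration of the refactor described above, the sketch below filters generic query words, applies per-field weights, and drops results under a minimum score. `COMMON_WORDS` is named in the commit message, but its contents here, along with `FIELD_WEIGHTS`, `MIN_SCORE`, `Training`, and `rankWithThreshold`, are assumptions chosen for the example.

```typescript
// Hypothetical record shape.
interface Training {
  name: string;
  description: string;
}

// Assumed values; the real set, weights, and threshold are not shown in this thread.
const COMMON_WORDS: ReadonlySet<string> = new Set([
  "program", "training", "course", "certificate",
]);
const FIELD_WEIGHTS = { name: 10, description: 3 };
const MIN_SCORE = 10;

function splitWords(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 0);
}

function rankWithThreshold(query: string, trainings: Training[]): Training[] {
  // Generic terms carry no signal, so they never contribute to the score.
  const queryTokens = splitWords(query).filter((t) => !COMMON_WORDS.has(t));

  return trainings
    .map((training) => {
      const nameTokens = new Set(splitWords(training.name));
      const descriptionTokens = new Set(splitWords(training.description));
      let score = 0;
      for (const token of queryTokens) {
        if (nameTokens.has(token)) score += FIELD_WEIGHTS.name;
        if (descriptionTokens.has(token)) score += FIELD_WEIGHTS.description;
      }
      return { training, score };
    })
    .filter((entry) => entry.score >= MIN_SCORE) // relevance cutoff
    .sort((a, b) => b.score - a.score) // highest score first
    .map((entry) => entry.training);
}
```

With these values a match in the description alone is not enough to pass the cutoff, which is one way to remove low-relevance results.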
3 participants