Give posts (any type) higher priority in link search results #63683
Comments
Take this too literally and we'll cause a regression of #56478 😀 I think we want to prioritise posts and pages but not always place them above every other result. It's important that users can easily link to tags and categories especially from the Navigation block. |
I'd say you are many times more likely to search for and link to a page than a tag or category archive. I would say given a page, tag, and attachment matched with Perhaps deprioritizing attachments would meet the expectations better? |
Yeah in your example screenshot I'd expect Composing with patterns to be first but if you searched for "patterns-1" I'd expect patterns-1 to appear first. The key thing to bear in mind is that we don't want to regress #56478 as that bug made creating some types of navigation basically impossible. I think giving posts a slight (25%? need to play with the exact number) boost and attachments a slight penalty (25%?) should work. |
I'm down for trying that. |
After updating to WordPress 6.7 this is something we've heard several client teams complain about. They are having a much harder time finding the actual relevant content than they did before the update which is impacting their workflows. In all honesty I would love an option to remove attachments from the results altogether. On most sites that simply isn't something that editors need to do. And if so they can add the link manually 🤔 The exploration of giving less prominence to attachments sounds like a good first step though 👍 I'm also going to add the |
Hey @noisysocks, I explored implementing the 25% boost for post types and the 25% penalty for attachments. Here are my findings:

Initially, I directly applied the adjustments like this:

// Boost for post types, penalty for attachments
if (result.kind === 'post-type') {
	relevanceScore *= 1.25;
} else if (result.kind === 'media') {
	relevanceScore *= 0.75;
}

However, I noticed that the recently added sorting logic was penalizing results with longer titles. This was due to the score calculation formula:

(exactMatchingTokens.length / titleTokens.length) * 10;
const subMatchScore = subMatchingTokens.length / titleTokens.length;

To address this, I modified the logic to depend on the length of the search query instead of the title:

const exactMatchScore =
	(exactMatchingTokens.length / searchTokens.length) * 10;
const subMatchScore =
	subMatchingTokens.length / searchTokens.length;

This worked better, but exact string matches were still being ranked lower than post types. To resolve this, I added a significant boost for exact title matches to ensure they appear at the top:

// Significant boost for exact title matches
if (result.title.toLowerCase() === search.toLowerCase()) {
	relevanceScore *= 100;
}

Currently, the ranking logic is functioning as follows (I’ll share a video to demonstrate this). I’ve also retested the previously implemented fixes to ensure they aren’t breaking anything, and everything appears to be working as expected. Do you think this approach is good to proceed with, or would you suggest any additional changes? If everything looks good, I’ll raise a PR with these updates. Thanks!

Complete code:

export function sortResults( results: SearchResult[], search: string ) {
	const searchTokens = tokenize( search );
	const scores = {};
	for ( const result of results ) {
		if ( result.title ) {
			const titleTokens = tokenize( result.title );
			// Title tokens that are identical to one of the search tokens.
			const exactMatchingTokens = titleTokens.filter( ( titleToken ) =>
				searchTokens.some(
					( searchToken ) => titleToken === searchToken
				)
			);
			// Title tokens that contain a search token without being identical to it.
			const subMatchingTokens = titleTokens.filter( ( titleToken ) =>
				searchTokens.some(
					( searchToken ) =>
						titleToken !== searchToken &&
						titleToken.includes( searchToken )
				)
			);
			// The score is a combination of exact matches and sub-matches.
			// More weight is given to exact matches, as they are more relevant (e.g. "cat" vs "caterpillar").
			// Dividing by the number of tokens in the search query (rather than in the title)
			// normalizes the score without penalizing longer titles.
			const exactMatchScore =
				( exactMatchingTokens.length / searchTokens.length ) * 10;
			const subMatchScore =
				subMatchingTokens.length / searchTokens.length;
			let relevanceScore = exactMatchScore + subMatchScore;
			// Boost for post types, penalty for attachments
			if ( result.kind === 'post-type' ) {
				relevanceScore *= 1.25;
			} else if ( result.kind === 'media' ) {
				relevanceScore *= 0.75;
			}
			// Significant boost for exact title matches
			if ( result.title.toLowerCase() === search.toLowerCase() ) {
				relevanceScore *= 100;
			}
			scores[ result.id ] = relevanceScore;
		} else {
			scores[ result.id ] = 0;
		}
	}
	return results.sort( ( a, b ) => scores[ b.id ] - scores[ a.id ] );
}

Preview
Current implementation: Screen.Recording.2024-12-09.at.7.50.25.PM.mov
Tested whether the current changes break the previously added fix in #67367: Screen.Recording.2024-12-09.at.7.51.50.PM.mov |
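To make the effect of the normalization change easier to see, here is a minimal, self-contained sketch that scores a title with either divisor. It is only an illustration: `tokenize` here is a hypothetical stand-in for the real helper, and `scoreTitle` and `applyKindWeight` are names made up for this example, not functions from the codebase. It also assumes, purely for illustration, that "patterns-1" is an attachment and "Composing with patterns" is a page.

```typescript
// Hypothetical tokenizer for illustration; the real `tokenize` helper may differ.
function tokenize( text: string ): string[] {
	return text.toLowerCase().match( /[a-z0-9]+/g ) || [];
}

// Scores a title against a search string, dividing either by the number of
// title tokens (previous behaviour) or by the number of search tokens (proposed).
function scoreTitle(
	title: string,
	search: string,
	normalizeBy: 'title' | 'search'
): number {
	const titleTokens = tokenize( title );
	const searchTokens = tokenize( search );
	const divisor =
		normalizeBy === 'title' ? titleTokens.length : searchTokens.length;
	if ( divisor === 0 ) {
		return 0;
	}
	// Title tokens identical to a search token.
	const exact = titleTokens.filter( ( t ) =>
		searchTokens.some( ( s ) => t === s )
	);
	// Title tokens that contain a search token without being identical to it.
	const sub = titleTokens.filter( ( t ) =>
		searchTokens.some( ( s ) => t !== s && t.indexOf( s ) !== -1 )
	);
	return ( exact.length / divisor ) * 10 + sub.length / divisor;
}

// Applies the 25% boost / 25% penalty discussed in this thread.
function applyKindWeight( score: number, kind: string ): number {
	if ( kind === 'post-type' ) {
		return score * 1.25;
	}
	if ( kind === 'media' ) {
		return score * 0.75;
	}
	return score;
}
```

With the search term "patterns", dividing by title length gives "Composing with patterns" roughly 3.33 and "patterns-1" 5, so the shorter title wins. Dividing by search length gives both 10, and the kind weighting then yields 12.5 for the page and 7.5 for the attachment.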
There's #67563 which has a working prioritisation of Posts. I would love for some reviews on that and/or code contributions to tweak this towards what we need. |
Hi @getdave, I tested the solution in #67563, and it seems to work well for me overall. Initially, I thought it might cause the regression mentioned in #56478, but after further testing, I was unable to reproduce the issue. Specifically, I created multiple posts using the following commands:
Additionally, I created one category and one attachment named "Adventure". In my tests, the sorting behavior appears to work correctly.
2024-12-10.21-15-04.mp4
@noisysocks, could you confirm whether the test cases for this scenario accurately validate the issue described in #56478? I’d appreciate your thoughts on this, @getdave. |
I'm beginning to think that an approach that relies solely on weighting may not be able to fully solve this problem. Search results are always limited to a maximum of 20 results, but what users want to prioritize can vary infinitely depending on user preferences and site content. Maybe the UI itself needs some improvements, like the following (please excuse the clumsy design 😅):
Add a button to load more search results.
Allow search results to be filtered by type.
@WordPress/gutenberg-design Any ideas? |
A dropdown to let you filter by type could be useful (I could see a filter dropdown live inside the input). The only hesitancy there is that this doesn't solve the main issue at hand, which is that the default search should either emphasize things that are not attachments, or de-emphasize attachments. We might even omit attachments as suggestions entirely, IMO the main flow for linking such is to use the media library. |
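As a rough illustration of the filtering idea, a type filter (or omitting attachments by default) could be sketched as below. This is an assumption-laden sketch, not the LinkControl implementation: `LinkResult` and `filterByKind` are hypothetical names, and only the `kind` values `'post-type'` and `'media'` come from this thread; `'taxonomy'` is assumed here for tags and categories.

```typescript
// Hypothetical result shape; the real SearchResult type may have more fields.
interface LinkResult {
	id: number;
	title: string;
	kind: string; // 'post-type' and 'media' appear in this thread; 'taxonomy' is assumed.
}

// Returns only the results whose kind is not excluded.
// By default attachments ('media') are omitted from the suggestions.
function filterByKind(
	results: LinkResult[],
	excludedKinds: string[] = [ 'media' ]
): LinkResult[] {
	return results.filter(
		( result ) => excludedKinds.indexOf( result.kind ) === -1
	);
}
```

Passing an empty array as `excludedKinds` keeps every result, which is what a user-facing "all types" option in a filter dropdown would map to.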
To provide everyone with context, attachments were added because there are users who want to link to documents (e.g. PDFs). It's quite common. I would support @t-hamano's proposal in conjunction with improving the weighting. What I would say in terms of design is that I remember this being explored previously and we quickly realised we'd need additional tabs other than Thanks for the dialogue here. Great to see 👍 |
I agree that filtering would be a useful enhancement, and can theoretically be used to solve this issue. A simple way to start might be to add two tabs;
|
In LinkControl you can search for content existing on your site. This is great, but I did find that attachments were surfaced higher in search results than posts matching the search requirements.
I propose that pages and posts of all types are prioritized in the search results, above all others. Users are much more likely to link to pages and posts than to attachments.
Visual