Alphabetization / sorting of punctuation marks



nospotify
Posted (edited)

I was puzzled by the alphabetization of Artists in my Music library and then realized it's because Emby treats punctuation marks as prior to letters in sorting. As with ignoring "The" in Artist/Title names, shouldn't Emby ignore leading punctuation marks in Title/Name? And the sorting of numbers in a Name/Title is entirely puzzling.

In the example image below:

1. [¡Cubanismo!] - this is the visibly obvious example - the leading Spanish upside down exclamation point should be ignored for alphabetization purposes, so it sorts as beginning with "C".

2. [Ali-Naqi] in the deeper metadata is ['Ali-Naqi], which is why it appears up front, but the leading single quote should be ignored for alphabetization purposes.

3. [Stick McGee] in the retrieved-from-MusicBrainz metadata is ["Stick" McGee], but the quotes around the first name should be ignored for alphabetization purposes.

4. And why would alphabetization/sorting go from [10,000 Maniacs] to [29th Street] to [1960s]?

Screenshot_20230116-171058.thumb.png.a1584f0dc0c820a037e0f1f5758a2a24.png

Edited by wordlover
nospotify
Posted

Any reply from Emby staff?

Happy2Play
Posted

Hard to say without trying to reproduce those Artists.

What shows for SortTitle in edit metadata?

What is the Sort by option set to on that tab?

Sorted under D due to SortTitle

image.png.fdcdd611caec3c267b854f6beac19d16.png

Only has that name on Discogs.

image.png.08bc5e2ba90b6b6670c0962193269fa7.png

Spoiler

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<artist>
  <biography><![CDATA[Hardstyle DJ and producer from the Netherlands.  
Born 25-1-1988]]></biography>
  <outline><![CDATA[Hardstyle DJ and producer from the Netherlands.  
Born 25-1-1988]]></outline>
  <lockdata>false</lockdata>
  <dateadded>2022-09-08 16:49:35</dateadded>
  <title>B-Front</title>
  <sorttitle>DJ B-Front</sorttitle>
  <runtime>0</runtime>
  <genre>Hardstyle</genre>
  <genre>Test</genre>
  <genre>Example</genre>
  <genre>Hard Dance</genre>
  <audiodbartistid>171553</audiodbartistid>
  <musicbrainzartistid>793b8a4d-8f89-445e-a6da-f0ba42e20c8d</musicbrainzartistid>
  <uniqueid type="musicbrainzartist">793b8a4d-8f89-445e-a6da-f0ba42e20c8d</uniqueid>
  <uniqueid type="audiodbartist">171553</uniqueid>
  <uniqueid type="discogsartist">334053</uniqueid>
  <discogsartistid>334053</discogsartistid>
  <album>
    <title>Infinite</title>
    <year>2017</year>
  </album>
</artist>

 

Happy2Play
Posted (edited)

@Luke Not sure without confirmation from the OP, but Discogs appears to be a factor here.

image.thumb.png.3fa965dec208a4f7ea509db62912261c.png

Zedd

image.png.44fe3cf279d444e73df8bffe3defd91c.png

image.png.78833444c55398ad639ec6bdab6cde62.png

Pitbull

image.png.cf1a72d6f028827a645564e17a3f3e1b.png

image.png.46762d47c2399fb369057d62f18acebd.png

 

 

Edited by Happy2Play
nospotify
Posted

Category is "Artist." Sort is by "Title." Metadata screenclips attached below. All have Discogs and MusicBrainz IDs (i.e. successful lookups). But the Titles vs. Sort titles issue is not really related to the original question: why should punctuation marks be used for sorting at all? Shouldn't they be ignored? 

emby3.png

emby2.png

emby1.png

Happy2Play
Posted

Yes, the sorttitle is moving items because punctuation characters sort before letters. @Luke would have to comment on stripping punctuation, though; I do not know how many artists have punctuation as part of their proper name.

Not sure why Music acts differently than other content types here as they match the two fields by default.

But it looks to be how Discogs adds information.

"Stick" McGhee & His Buddies | Discography | Discogs

Stick McGhee & His Buddies - MusicBrainz

'Ali Naqi Vaziri | Discography | Discogs

علی نقی وزیری - MusicBrainz

 

 

nospotify
Posted

Again, let's keep these two issues separate. How sort titles are retrieved from external databases is one thing. The issue I am flagging is Emby sorting by punctuation marks, which I don't think is how things should work.

Happy2Play
Posted

If it is their name, the punctuation should never be removed.  So the issue is the metadata provider not Emby in my opinion.  

pwhodges
Posted

In another topic I noted that 20,000 in a title was sorted as 20, not as 20000.  This might require special treatment?  (aside from the conflict of continental Europe reversing the usage of "," and "." in numbers...).

Paul
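The "20,000 sorted as 20" observation can be illustrated with a minimal sketch. It assumes a numeric-aware ("natural") comparison that treats each run of digits as a number and stops at the comma; that is only a guess at the behaviour, not Emby's actual code, and `natural_key` is a hypothetical helper. Under that assumption the order from the first post (10,000 Maniacs, 29th Street, 1960s) falls out directly, because 10 < 29 < 1960. The optional `grouped` mode treats `,` as a thousands separator, which would be wrong for locales that use `.` instead.

```python
import re

# Plain digit runs: "10,000" is read as 10, then ",", then 0.
_DIGITS = re.compile(r"\d+")
# Assumption: "," is a thousands separator (not true in every locale).
_GROUPED = re.compile(r"\d{1,3}(?:,\d{3})+|\d+")


def natural_key(title, grouped=False):
    """Hypothetical sort key: numbers compare numerically, text case-insensitively."""
    pattern = _GROUPED if grouped else _DIGITS
    key = []
    pos = 0
    for match in pattern.finditer(title):
        if match.start() > pos:
            # Text chunk; the leading 1 makes text sort after numbers.
            key.append((1, title[pos:match.start()].casefold()))
        # Numeric chunk; commas are dropped before converting.
        key.append((0, int(match.group().replace(",", ""))))
        pos = match.end()
    if pos < len(title):
        key.append((1, title[pos:].casefold()))
    return key


titles = ["1960s", "29th Street", "10,000 Maniacs"]

print(sorted(titles, key=natural_key))
# ['10,000 Maniacs', '29th Street', '1960s']

print(sorted(titles, key=lambda t: natural_key(t, grouped=True)))
# ['29th Street', '1960s', '10,000 Maniacs']
```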

nospotify
Posted
1 hour ago, Happy2Play said:

If it is their name, the punctuation should never be removed.  So the issue is the metadata provider not Emby in my opinion.  

I'm sorry for not being clearer. I am not saying "remove the ¡ from Cubanismo's name"; I am saying Emby should ignore leading punctuation marks when sorting.
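To make the distinction concrete, here is a minimal sketch of a sort key that leaves the displayed name untouched and only skips leading punctuation when comparing. The `sort_key` function is hypothetical and is not taken from Emby; it assumes "punctuation" means the Unicode `P*` categories plus leading whitespace.

```python
import unicodedata


def sort_key(name):
    """Hypothetical key: ignore leading punctuation/whitespace, compare case-insensitively."""
    i = 0
    # Unicode general categories starting with "P" are punctuation, "Z" are separators.
    while i < len(name) and unicodedata.category(name[i])[0] in ("P", "Z"):
        i += 1
    stripped = name[i:]
    # A name made only of punctuation (e.g. "!!!") keeps its original form.
    return (stripped or name).casefold()


artists = ["Zedd", "¡Cubanismo!", '"Stick" McGee', "'Ali-Naqi", "Pitbull"]

for artist in sorted(artists, key=sort_key):
    print(artist)  # names are printed unchanged; only the ordering ignores the marks
```

With this key the list comes out as 'Ali-Naqi, ¡Cubanismo!, Pitbull, "Stick" McGee, Zedd, and none of the names is modified.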

visproduction
Posted (edited)

Is everyone aware that fixing this with several regex literals in a JS function makes a very noticeable hit on page load times? User servers are not necessarily optimized for running complex JS regexes, so you could easily add a few seconds on a large list with multiple regex sorts, especially if the complex regex sort happens on every page for every user. It's possible to cache some sorts, but that takes extra code and would only really help identical sorts that appear often. If the server were in the cloud, optimized and distributed across multiple locations, these advanced sort commands could be optimized and the end user would see a fast response. That is not the Emby architecture, which runs on a local server. Maybe it can work, but how beneficial would this feature be if it made every large library page load 2 to 6 seconds slower for every person? That is the danger.

The custom sort option, per media entry, solves this issue without extra server load at all.  If you wish to admin your collection, then perhaps be prepared to do some data tweaking to make it run better, rather than asking the software to do everything for you and slow down the user experience.
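One way to limit the cost described above would be to compute the normalized name once, when the item's metadata is saved, and sort on the stored value afterwards. This is essentially what a stored sort title already is. The sketch below is an illustration only: `Artist`, `normalize`, and the call counter are hypothetical names, and nothing here reflects how Emby is actually implemented.

```python
from dataclasses import dataclass, field

normalize_calls = 0  # counter used only to show how often normalization runs


def normalize(name):
    """Hypothetical normalizer: drop a fixed set of leading marks, then casefold."""
    global normalize_calls
    normalize_calls += 1
    stripped = name.lstrip("¡¿!?'\"“”‘’.-([ ")
    return (stripped or name).casefold()


@dataclass
class Artist:
    name: str
    sort_name: str = field(init=False)

    def __post_init__(self):
        # Paid once per item, at "metadata save" time.
        self.sort_name = normalize(self.name)


library = [Artist(n) for n in ["Zedd", "¡Cubanismo!", "'Ali-Naqi", "Pitbull"]]

# Three simulated page loads: each sort only compares the stored strings.
for _ in range(3):
    page = sorted(library, key=lambda a: a.sort_name)

print([a.name for a in page])  # ["'Ali-Naqi", '¡Cubanismo!', 'Pitbull', 'Zedd']
print(normalize_calls)         # 4
```

The normalizer runs four times in total, once per artist, no matter how many times the list is sorted for display.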

Edited by visproduction
nospotify
Posted

I don't know Emby's code and what changes might address this problem, nor the issue flagged by several of us about odd-seeming sorting of names/titles that contain numerals. It would be great to hear from one of the devs, as @Happy2Play asked. 

Luke
Posted

Hi, we'll take a look at this. Thanks for reporting.

Posted
11 hours ago, Luke said:

Hi, we'll take a look at this. Thanks for reporting.

Hi All,

Punctuation marks impact sorting in all libraries of the app. Everyone's media is different, but here are some quick-and-dirty metrics from mine.

Type            #
Artists         1
AlbumArtists    1
Albums          26
Movies          2

 

The items with numbers in the title vary. But they are "text", and I don't know the "rules" for alphabetizing numbers stored as text. They are all grouped together where I would expect to see them.

-vicpa

 

BTW, the UI lets you fix these pretty easily: just edit the Sort Name. I guess it depends on how many you have in your libraries.

 

 

 

 
