
How to find duplicates?


brallor


Happy2Play
1 minute ago, Ronstang said:

can you please show me how you used EmbyStat to display duplicates?

reggi has rebuilt EmbyStat again and it does not look like this section exists anymore. @reggi, is there an easy way to filter duplicates in the table?


Ronstang

I hope there is, or that he puts it back in, because it is a very handy statistic for those of us with bigger and growing collections. I don't always know whether I have a movie, or its source. I would rather just add new stuff and sort out the dupes and the best ones to keep using a tool such as EmbyStat.


Yup, another 18+ months without this feature added/restored... I'm still having to run multiple media servers, and Ruby now, to find duplicates and missing episodes. Sad really, but obviously this isn't a feature that is a concern to the Emby devs, so it won't be addressed. Good luck Ronstang!


Happy2Play
11 minutes ago, CaseyP said:

Yup, another 18+ months without this feature added/restored... I'm still having to run multiple media servers, and Ruby now, to find duplicates and missing episodes. Sad really, but obviously this isn't a feature that is a concern to the Emby devs, so it won't be addressed. Good luck Ronstang!

At the same time, it is a grey area because of multi-versioning. But there are ways: EmbyStat provides you a list, though you currently have to do the legwork yourself. I have asked about it on GitHub.

The Reports plugin can export your entire library to CSV or Excel, so you can find duplicates that way too. But everyone will want something different.

@chef could this be done easily in a plugin by matching, say, IMDb/TMDb ids? Or with a filter added to the Reports plugin?

Edited by Happy2Play

2 hours ago, CaseyP said:

Yup, another 18+ months without this feature added/restored... I'm still having to run multiple media servers, and Ruby now, to find duplicates and missing episodes. Sad really, but obviously this isn't a feature that is a concern to the Emby devs, so it won't be addressed. Good luck Ronstang!

@CaseyP What is Ruby?

 

 

@chef @Happy2Play @reggi Will this be a feature that will be forever forgotten?

Can we get a work around for finding duplicates if EmbyStats will no longer find these duplicates for us?

 

@Happy2Play Are we able to still use the  previous version of EmbyStats before it was rebuilt by reggi?


1 hour ago, GtownE said:

@CaseyP What is Ruby?

 

 

@chef @Happy2Play @reggi Will this be a feature that will be forever forgotten?

Can we get a work around for finding duplicates if EmbyStats will no longer find these duplicates for us?

 

@Happy2Play Are we able to still use the  previous version of EmbyStats before it was rebuilt by reggi?

Do you mean the plugin?


pwhodges
4 hours ago, GtownE said:

What is Ruby?

Ruby is a modern programming language, much better designed than many of its competitors.

Paul


rbjtech
On 13/10/2020 at 04:55, Ronstang said:

Manually typing it worked but I don't understand why pasting it in didn't work.  I did not add a space at the end or front and I tried it many times.....oh well, it's working now.

Thank you .....everyone here is always helpful

A trick that sometimes works is to paste into 'notepad' or a simple text editor.  Check for a leading and/or trailing 'space' in the text editor before you cut and paste it again (from the text editor) into the browser.   


If you don't mind shutting down Emby, you could get this information by running a SQL statement. If Emby didn't open the database in exclusive mode, you could do it while Emby is running.

You could also just shut down Emby, copy the library.db file, restart Emby, and then work from the DB copy.
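
For anyone who wants to try the SQL route on a copy, here is a minimal Python sketch. It assumes you have already shut Emby down and copied library.db somewhere else. The table and column names (`MediaItems`, `Name`, `Path`) are assumptions for illustration, not a documented schema, so check your own copy and pass in whatever it actually uses.

```python
import sqlite3
from pathlib import Path


def find_duplicate_names(db_copy, table="MediaItems", name_col="Name", path_col="Path"):
    """Return {name: [paths]} for every name that appears more than once.

    table, name_col and path_col are ASSUMED names, not a documented schema.
    Inspect your own copy of the database before relying on them.
    """
    # Open the copy read-only so nothing in it can be modified by accident.
    uri = Path(db_copy).resolve().as_uri() + "?mode=ro"
    con = sqlite3.connect(uri, uri=True)
    try:
        query = (
            f'SELECT "{name_col}", "{path_col}" FROM "{table}" '
            f'WHERE "{name_col}" IN ('
            f'  SELECT "{name_col}" FROM "{table}" '
            f'  GROUP BY "{name_col}" HAVING COUNT(*) > 1) '
            f'ORDER BY "{name_col}", "{path_col}"'
        )
        dupes = {}
        for name, path in con.execute(query):
            dupes.setdefault(name, []).append(path)
        return dupes
    finally:
        con.close()
```

Called as `find_duplicate_names("library_copy.db")` (a hypothetical file name), it returns a dictionary mapping each repeated name to its paths. A real library also holds episodes, folders and so on, so you would probably want an extra WHERE clause to restrict the item type and to skip 3D and 4K copies.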


Ronstang

That's too much effort....LOL.  I shouldn't have but a few dupes now as I manually went through Emby and purged them all by moving them to offline hard drives.  They are extra copies of the same movies either recorded on the same or different channel so in a sense they are a backup in case there are any problems with the ones currently on the server.  I would just like to be able to run a dupe check every once in a while in case I record and add a movie I forgot I had as it would be easy to take care of as it would be in the latest directory that I store my movies and near the top based on date added.

More importantly I'd like to be able to do a quality check.  I have hundreds of old DIVX compressed horror movies that I would like to replace when the opportunity arises as their quality is lower than anything else on my server.  The only reason I keep them is some are extremely hard to find these days or impossible.  Once I had a replacement on the server it would be easy to just delete them if I could easily do a check for duplicates. 

It is just very tedious to go through the entire web interface scrolling page after page looking for the movies that are dupes, and sometimes I miss them because Emby has downloaded different album art than what I use most of the time.  I like original movie posters whenever possible.

I was really hopeful when I found EmbyStat...but this thread is old and it USED to do exactly what I needed until it was upgraded.  I'd like to run the old version if I could find it anywhere.


Happy2Play
17 hours ago, GtownE said:

Are we able to still use the  previous version of EmbyStats before it was rebuilt by reggi?

All you can do is test, as every version has its own quirks. But yes, a previous version should work: I just tested with 0.2.0 beta20, which has a slightly different layout, but there is a duplicate movies list on the Movies screen.


PenkethBoy

You already have the tools to do this with a small amount of work, until it's added to Emby or somebody writes a plugin etc., though I don't think that's necessary.

To do this simply and quickly, without too much faffing about, you have two basic approaches depending on what you want to compare or accomplish.

1. You want to compare the "movie names" that appear in Emby

2. Or you want to compare movie paths\filenames

===============================

For 1 - I'm using Excel 2007, so options might be in a different place on newer versions.

Use the Emby Reports plugin and export the movie list to a CSV (ignore the Excel option as it complicates things), either as one list or, if you have a huge collection, in chunks that you append together in a text editor.

Open the CSV in Excel (or another similar program).

Then select column A and find "Text to Columns" on the Data tab in Excel. Choose delimited text, click Next, choose ";" as the delimiter, then click Finish, and OK in the dialog that pops up.

Now you have a series of columns with the data in a more readable and useful format.

Final step: select column E (the "Name" column) and, on the Excel Home tab, pick "Conditional Formatting" -> New Rule -> "Format only unique or duplicate values" (by default it will pick duplicates) -> click the Format option -> Fill tab and pick a highlight colour, say yellow, as it will stand out. OK a couple of times and you are done.

So if you see no yellow highlighting, you have no duplicates.

You can obviously do this for the paths too, but I would suggest you split the paths up using the Text to Columns function above so you can target directories etc. See below.
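
If you would rather skip Excel, the same duplicate check on the "Name" column can be sketched in a few lines of Python. This is only a sketch under some assumptions: that the exported CSV has a header row containing a `Name` column, that it is ";"-delimited as described above, and that it is UTF-8 encoded. The function name is made up for illustration.

```python
import csv
from collections import defaultdict


def duplicate_names(csv_path, column="Name", delimiter=";"):
    """Group the rows of the exported report by title; keep titles seen more than once."""
    groups = defaultdict(list)
    # utf-8-sig is an ASSUMED encoding; it also tolerates a leading byte-order mark.
    with open(csv_path, newline="", encoding="utf-8-sig") as fh:
        for row in csv.DictReader(fh, delimiter=delimiter):
            title = (row.get(column) or "").strip()
            if title:
                # casefold() so "Alien" and "ALIEN" count as the same title
                groups[title.casefold()].append(row)
    return {key: rows for key, rows in groups.items() if len(rows) > 1}
```

Each value is the list of full rows for that title, so any other column in the export (path, resolution and so on, if present) is still there to help you decide which copy to keep.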

=======================

For 2

You can use the Path column from the CSV above and do the same as for 1, but there is an alternative way outside of Emby.

Go to Windows Explorer and navigate to the root of your movies folder structure, e.g. m:\movies

In the search box at the top right, type "kind:=video" and press Return. Windows will now search for any file it considers a video. Depending on the speed of your machine and the number of videos, it might take a few minutes to get the list. Wait!

Next, in Explorer, click "View", change to "Details" and sort by name, as the list will be in a random order by default (the order in which Windows found the files).

Select all with Ctrl+A, then hold down the Shift key and right-click on the files. You will now find that you have a new option, "Copy as path".

Open a text editor (not Word!!!!) and paste in the data.

Use the find and replace option to remove the double quotes around the paths, then save as a CSV file.

Then follow 1 above to split the file paths up, using "\" as the delimiter via the "Other" option, and compare the elements of the path as you need.
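
As an alternative to splitting the pasted paths in Excel, here is a rough Python sketch that takes the "Copy as path" text as it comes (quotes included) and groups the paths by file name, ignoring the folder and the extension. The function name is hypothetical, and matching on file name alone is an assumption: it only finds copies that were named the same way.

```python
from collections import defaultdict
from pathlib import PureWindowsPath


def group_by_file_name(lines):
    """Map each lower-cased file name (without extension) to the paths that share it."""
    groups = defaultdict(list)
    for line in lines:
        # "Copy as path" wraps each path in double quotes, so strip them here
        text = line.strip().strip('"')
        if text:
            groups[PureWindowsPath(text).stem.lower()].append(text)
    return {stem: paths for stem, paths in groups.items() if len(paths) > 1}
```

You could feed it the saved text file with something like `group_by_file_name(open("paths.txt", encoding="utf-8"))`, where `paths.txt` is whatever you called the pasted list. `PureWindowsPath` is used so the backslash-separated paths are parsed the same way on any machine.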

This takes a lot longer to read than to do. It's simple data manipulation, not difficult or time consuming for the few times you are going to do this.

You could set up a template in Excel to deal with the data and streamline the process a bit, but for me it's not worth the effort.

So simples!!!!! :)

Have Fun

 

@cayars another KB article for you to do :) - copy away old boy!

 

 


Edited by PenkethBoy

That's way too complicated for me. :)

I prefer to shut down Emby, since it uses the database exclusively (make this an option),
run a batch file which creates a text file of all dupes (excluding 3D and 4K),
then start Emby.

It could even be added to the nightly routine if you want to restart Emby each night.


PenkethBoy

Not complicated at all.

If anything, for the average user, messing with a DB via SQL and a script is more complex and more of a black box than using things the average user can use!

And you can do it with Emby running, as it has minimal impact other than running the Reports plugin.

:) 

Edited by PenkethBoy

rbjtech

.. surprised to see the use of a GUI there @PenkethBoy ..  😃

Even with my antiquated scripting skills - getting the list of mkv/mp4 files for comparing is a two line batch file ..

 

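@echo off
rem Append the bare file name of every *.m?? file (mkv, mp4, ...) under the current folder to filelist.txt
for /f "delims=" %%a in ('dir /a-s /b /s *.m??') do (
    rem Redirect first so no trailing space is written after each name
    >>filelist.txt echo %%~nxa
)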

Powershell will probably do it in one line .. ?

 

Edited by rbjtech

PenkethBoy
4 hours ago, rbjtech said:

@rbjtech

Powershell will probably do it in one line .. ?

 

Yep - for the washed with added brain cells

 

Annotation 2020-10-15 153834.jpg

Edited by PenkethBoy
Changed Thread Name

rbjtech
6 minutes ago, cayars said:

Problem with those solutions is you can't use them against 40 drives of data. :)

Why not? Just use an input file telling it where your data is and let it do its thang. I'm attempting to do that as we speak. It won't be pretty, but it'll work ...👍

Edit: ah, you mean duplicates across 40 drives. With you! No, it can't do that .. currently..
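
For what it's worth, the "input file of locations" idea can be stretched to cover duplicates across drives if every location feeds one master list. A rough Python sketch follows, not the batch file being discussed. The roots file format (one folder per line, `#` for comments), the function names and the extension set are all assumptions for illustration.

```python
import os
from collections import defaultdict

# ASSUMED set of video extensions; adjust to whatever your library contains
VIDEO_EXTENSIONS = {".mkv", ".mp4", ".m4v", ".avi"}


def read_roots(list_file):
    """Read one folder per line; blank lines and lines starting with # are skipped."""
    with open(list_file, encoding="utf-8") as fh:
        return [ln.strip() for ln in fh if ln.strip() and not ln.lstrip().startswith("#")]


def duplicates_across_roots(roots):
    """Walk every root and return {file name without extension: [full paths]} for repeats."""
    seen = defaultdict(list)
    for root in roots:
        for folder, _dirs, files in os.walk(root):
            for name in files:
                stem, ext = os.path.splitext(name)
                if ext.lower() in VIDEO_EXTENSIONS:
                    seen[stem.lower()].append(os.path.join(folder, name))
    return {stem: sorted(paths) for stem, paths in seen.items() if len(paths) > 1}
```

Something like `duplicates_across_roots(read_roots("roots.txt"))` would then report every file name that turns up more than once, whichever drive it lives on. It matches on file name only, so it cannot tell a 3D or 4K copy from a regular one unless the names or folders say so.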

Edited by rbjtech

Actually 112 drives.  Yes looking for duplicates across all the drives.

With SQL using the Emby Database it doesn't matter how many drives you have.  You can also easily filter out 3D and 4K movies if you like or only look for dupes in those.

If Emby didn't take exclusive use of the databases (unlike a certain competitor) it would be really easy to run SQL scripts against the databases for many things like this.

@Luke any chance we could have the option of NOT opening the sqlite databases exclusively (maybe system.xml setting)?


yaksplat
23 minutes ago, cayars said:

Problem with those solutions is you can't use them against 40 drives of data. :)

You can with the Ruby script. I did it across 24.

Edited by yaksplat

Well, you can do it with any scripting language. You just need to build a master list of all the content, but then you can't easily eliminate 3D or 4K material unless you keep it in different locations.

But it's still not easier, IMHO, than just running a SQL statement against a database that already has all the data, including movie name, location and its attributes.


PenkethBoy
51 minutes ago, cayars said:

Problem with those solutions is you can't use them against 40 drives of data. :)

yes you can - just not being creative enough

and one solution is pooling software

