/t/ - Technology

Discussion of Technology

(4.11 KB 300x100 simplebanner.png)

Hydrus Network General #11 Anonymous Board volunteer 02/05/2025 (Wed) 22:34:45 No. 17183 >>17234>>17593
This is a thread for releases, bug reports, and other discussion for the hydrus network software.

The hydrus network client is an application written for Anon and other internet-fluent media nerds who have large image/swf/webm collections. It browses with tags instead of folders, a little like a booru on your desktop. Users can choose to download and share tags through a Public Tag Repository that now has more than 2 billion tag mappings, and advanced users may set up their own repositories just for themselves and friends. Everything is free and privacy is the first concern.

Releases are available for Windows, Linux, and macOS, and it is now easy to run the program straight from source.

I am the hydrus developer. I am continually working on the software and try to put out a new release every Wednesday by 8pm EST. Past hydrus imageboard discussion, and these generals as they hit the post limit, are being archived at >>>/hydrus/ .

Hydrus is a powerful and complicated program, and it is not for everyone. If you would like to learn more, please check out the extensive help and getting started guide here: https://hydrusnetwork.github.io/hydrus/

Previous thread >>>/hydrus/22247
Edited last time by hydrus_dev on 04/19/2025 (Sat) 18:45:34.
https://www.youtube.com/watch?v=Xqg2KSdBLRU
windows zip: https://github.com/hydrusnetwork/hydrus/releases/download/v608/Hydrus.Network.608.-.Windows.-.Extract.only.zip
windows exe: https://github.com/hydrusnetwork/hydrus/releases/download/v608/Hydrus.Network.608.-.Windows.-.Installer.exe
macOS app: https://github.com/hydrusnetwork/hydrus/releases/download/v608/Hydrus.Network.608.-.macOS.-.App.dmg
linux tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v608/Hydrus.Network.608.-.Linux.-.Executable.tar.zst

I had a good simple week. Just a bunch of small fixes and improvements.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

highlights

The new mpv error hook that detects the 'too many events' error is overhauled. It shouldn't fire unless there is a serious problem now. If you were hit with false positives, sorry for the inconvenience the past few weeks!

After the recent successful 'group namespaces (user)' update to tag sort, and after talking with some users, I have decided to make that namespace grouping always sort the same way, no matter which way you sort the underlying groups of tags. This means, e.g., that 'creator' tags will stay at the top, then 'series' and 'character' etc., whether you say 'most first' or 'fewest first'. See how you like it--I think it is more intuitive now.

I improved how the multi-column lists across the program set their minimum and initial height, and in most cases and styles they now appear pixel-perfect to the number of rows desired. In particular, in the gallery and watcher downloader pages, where the multi-column lists dynamically size themselves based on how many items they are holding, they should fit exactly. I find the perfection a little eerie, so I may add just a few pixels of padding so the back of your brain knows there isn't anything to scroll down to.

next week

I finished the duplicates auto-resolution database work this week, so all that's left is a daemon to run everything and a preview panel. I'll see if I can get the daemon done.
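The fixed namespace grouping described in the highlights can be sketched roughly like this (illustrative Python only; the namespace list and function names are assumptions, not hydrus's real internals). Reversing the sort only flips the order within each namespace group; the groups themselves stay put, so 'creator' tags always sit at the top:

```python
# Hypothetical sketch of the 'group namespaces' tag sort: group order is fixed,
# only the order within each group responds to the sort direction.
NAMESPACE_ORDER = ['creator', 'series', 'character', '']

def split_tag(tag):
    # 'creator:foo' -> ('creator', 'foo'); unnamespaced 'foo' -> ('', 'foo')
    if ':' in tag:
        namespace, subtag = tag.split(':', 1)
    else:
        namespace, subtag = '', tag
    return namespace, subtag

def grouped_sort(tags, reverse=False):
    # First sort by subtag (this is the part the user's direction flips) ...
    within = sorted(tags, key=lambda t: split_tag(t)[1], reverse=reverse)
    # ... then stable-sort by the fixed group index, so reversing the subtag
    # order never moves 'creator' off the top.
    def group(tag):
        namespace = split_tag(tag)[0]
        if namespace in NAMESPACE_ORDER:
            return NAMESPACE_ORDER.index(namespace)
        return len(NAMESPACE_ORDER)
    return sorted(within, key=group)
```

Python's `sorted` is stable, which is what lets the second pass preserve the within-group order established by the first.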
just wanted to say: keep up the great work!
(51.36 KB 540x540 borg.png)

Just wanted to say: hi
Would it be possible to add an undo to pages? I sometimes think I have my browser selected and press F5, only to wipe a page on hydrus because the search is empty and I've been manually dragging files to the page. A searchless page could also work.
>>17188 This used to happen to me all the time, so I just deleted the shortcut.
>>17189 Not a bad idea.
>>17186 >>17187 Thanks lads, keep on pushing.

>>17188 Yeah, I've done this myself. I have a barebones undo system, but I never wrote the original 'do a thing to a page' pipeline to plug into it correctly. Ideally, you want a chain of all changes recorded, with every change telling the program what it did and how to undo it, but stuff like the 'remove files' command just gets swallowed right now. I've slowly been converting my internal objects to better support this, but it'll be a while yet before I can support anything I'm happy with.

On the other hand, I'll be more immediately working on some better tools to 'sync' a page to its current results. If you are a long-time-enough user, you'll remember that 'open these files' pages used to have no search. They wouldn't respond to F5 either. I replaced them with a page that spawns with a system:hashes search, for KISS reasons, and while you can flip the 'searching immediately' button to reduce problems, we are now broadly missing the ability to have a 'read only' page that just holds some files. I'll be figuring out a checkbox or something that freezes or pins a page as it is, so guys like you don't lose their work on a scratchpad page.
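The 'chain of changes that know how to undo themselves' idea described above is the classic command pattern. Here is a minimal sketch with entirely hypothetical names (hydrus's real pipeline does not look like this):

```python
# Illustrative command-pattern undo: each action applied to a page is recorded
# as an object that knows how to reverse itself. All names are made up.
class Page:
    def __init__(self, files):
        self.files = set(files)

class RemoveFilesCommand:
    def __init__(self, page, file_ids):
        self._page = page
        self._file_ids = list(file_ids)

    def do(self):
        # Perform the action: take the files out of the page.
        self._page.files -= set(self._file_ids)

    def undo(self):
        # Reverse it exactly: put the same files back.
        self._page.files |= set(self._file_ids)

class UndoStack:
    def __init__(self):
        self._done = []

    def execute(self, command):
        command.do()
        self._done.append(command)

    def undo_last(self):
        if self._done:
            self._done.pop().undo()
```

The key point is that `RemoveFilesCommand` records exactly which files it removed, so an accidental F5-style wipe could be rolled back instead of being 'swallowed'.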
In the duplicate filter, when you compare videos, it seems to always give the longer video a much higher score--around 50 points, with no way to change it. This happens even if the "longer" video is a single frame longer. This is probably a bug.
Is there an option somewhere, or can we get it by default, to get a notification when cookies that are necessary for certain downloaders expire?

>>17192
>keep on pushing.
I will.
I had a good week. I mostly did boring cleanup work, but by serendipity I figured out a set of fun new file sorts that work using a file's average colour, so you can now sort from lightest to darkest, grey to saturated, and by hue. The release should be as normal tomorrow.

>>17194 Thanks, I fixed it for tomorrow to do 1-50 depending on the relative lengths, capping at twice the length. That's probably still a bit wrong, so let me know how it feels and we'll keep adjusting.

>>17198 The login system can kind of do this, except when a cookie is expired/missing, it then wants to start a login process. I'm not sure you can set up a login script that recognises a missing cookie but won't cause domain errors when it is empty or whatever login process inevitably fails. Maybe it is ok if it fails, since jobs on that domain will probably just cycle on 'login problem' delays until the cookie is filled in or you re-validate the login script for another attempt. Bit bleh all round though. Maybe I should consider starting an overhaul of that system, which I haven't touched in ages and which is really starting to creak. If you use Hydrus Companion, it can regularly sync your cookies and you avoid these issues, but if you have to do it manually, in your situation, I think you might have to monitor it manually too. Maybe you can resync every two weeks or something. Sorry!
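The adjusted duration score could look something like this sketch. The formula here is an assumption built from the description (scale 1-50 with the relative length difference, capping once one video is twice as long as the other); the real implementation may differ:

```python
# Hypothetical duplicate-filter duration score: instead of a flat ~50 points
# for any longer video, scale with the relative difference in length.
def duration_score(longer_ms, shorter_ms, max_score=50):
    if shorter_ms <= 0 or longer_ms <= shorter_ms:
        return 0  # equal lengths (or bad input): no score either way
    ratio = longer_ms / shorter_ms  # 1.0 and up
    # ratio 1.0 maps to ~0, ratio >= 2.0 (twice as long) caps at max_score
    fraction = min(ratio - 1.0, 1.0)
    # a one-frame difference still earns the minimum score of 1
    return max(1, round(fraction * max_score))
```

With this shape, a video one frame longer scores 1 rather than ~50, and the cap stops a ten-minute video from dwarfing a five-second clip by an unbounded amount.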
>>17200 >Maybe I should consider starting an overhaul of that system, which I haven't touched in ages and is really starting to creak. It's not a huge issue for me. Most of my cookies last for years, and the nearest expiring one is 4 months away.
>>17200 >grey to saturated, and by hue. This is immensely useful for tagging things as "black and white/monochrome/greyscale" or whatever tag one might use for that. It may also be able to sniff out things in other color scales for easy tagging. An image that's entirely in shades of blue is almost as lacking in color as an image in black and white.
Can I search for files that are in multiple domains at once? Under 'edit multiple locations' it seems that I can do 'domain A' OR 'domain B', but not 'domain A' AND 'domain B'.
>>17203 Not sure what you really want, but it seems you want either:

1. To search for files that, for example, must be in domain A AND also in domain B.

For that you should change your domain to one with a broad scope, for example 'all my files' or better 'all local files' (Advanced Mode), which would also contain trash. Then stack several 'system:file service' predicates:

system:is currently in my files
system:is currently in test location

That way it searches for all files that are in 'my files' AND 'test location' at the same time.

2. To activate the checkboxes in 'edit multiple locations' to look into those domains at the same time, and you think you can't.

Hydrus lets you activate the boxes that make sense. If you are not in Advanced Mode, you might have 'my files', 'all my files', 'trash', and maybe you added a new one, say 'test location'. If you check 'all my files', you can't check 'my files' or 'test location', because 'all my files' already contains them. You can check 'all my files' + 'trash' at the same time though, because 'trash' is not included in 'all my files'.

In Advanced Mode there are many more. If you check 'all local files', you can't check any of those mentioned above, because it contains them all. You can for example check 'my files' + 'test location' + 'trash' at the same time, because you didn't check a parent that already contains them, like 'all my files' or 'all local files'.

Also this might help: https://hydrusnetwork.github.io/hydrus/advanced_multiple_local_file_services.html
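The stacked 'system:file service' predicates above amount to a set intersection: you start from a broad domain and each predicate narrows the results. A toy illustration with made-up service contents (the service names and file ids are examples only, not anything from a real client):

```python
# Toy model of AND-searching file domains: each 'is currently in' predicate
# intersects the result set with that service's files.
files_in_service = {
    'my files': {1, 2, 3, 4},
    'test location': {3, 4, 5},
    'trash': {6},
}

def all_local_files():
    # The broad domain is the union of every local service.
    return set().union(*files_in_service.values())

def search(required_services):
    results = all_local_files()
    for service in required_services:
        results &= files_in_service[service]  # AND: must be in this one too
    return results
```

So `search(['my files', 'test location'])` keeps only the files present in both services, which is exactly the "remove them from the domain they do not belong in" workflow.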
https://www.youtube.com/watch?v=A1QUUmShhMM
windows zip: https://github.com/hydrusnetwork/hydrus/releases/download/v609/Hydrus.Network.609.-.Windows.-.Extract.only.zip
windows exe: https://github.com/hydrusnetwork/hydrus/releases/download/v609/Hydrus.Network.609.-.Windows.-.Installer.exe
macOS app: https://github.com/hydrusnetwork/hydrus/releases/download/v609/Hydrus.Network.609.-.macOS.-.App.zip
linux tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v609/Hydrus.Network.609.-.Linux.-.Executable.tar.zst

I had a good week. I mostly did boring cleanup work, but there's some neat new colour-based file sorting to try out.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

macOS zip

First off, the macOS release is now in a zip. We've had technical problems with DMG creation recently, and zip is much simpler, faster, and more reliable. You'll now be double-clicking the zip to extract the App and then dragging it to your Applications directory yourself. I am recommending all macOS users consider running from source, since it skips a bunch of App hassle, and particularly so if you are on Silicon (the App is intel atm, which causes a very slow load time on Silicon). https://hydrusnetwork.github.io/hydrus/running_from_source.html

colour sorting

A user mentioned that the 'blurhash' we recently added stores a file's average colour. I played around with the idea and figured out some fast 'sort files by colour' tech. It works for all files with thumbnails. They are all under a new 'average colour' menu, and there are five types: 'lightness', which obviously does dark to light; 'hue', which orders files according to the rainbow colour wheel you will have seen in any HSL colour palette panel; 'chromatic magnitude', which is basically saturation and does grey to colourful; 'balance - blue-yellow', which uses the idea of 'chromaticity' and scales from the bluest files to the most yellow; and 'balance - green-red', which similarly does greens to reds.
This isn't perfect, since it works on average colour, so a file with a very bright red blob on a dark background might be counted the same as a brown mush, but it is cool and fun to play with, and it is a good base for future, cleverer colour histogram tech. I'm really happy with how it turned out. Give it a go and let me know how it works for you.

next week

I did a bit of duplicates auto-resolution work this week, but I wasn't super successful. I'm going to do a bit more UI so I can get a better idea of how I want the daemon to work.
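The five sorts can be approximated from a file's average RGB with standard colour maths. This is a hedged sketch using common approximations (HLS for lightness/hue/saturation, simple channel differences for the two balances); hydrus's exact formulas are not shown here:

```python
import colorsys

# Rough sketch of the five 'average colour' sort keys, computed from a file's
# average RGB (e.g. as recovered from its blurhash). Approximations only.
def colour_sort_keys(r, g, b):
    # r, g, b are floats in 0.0-1.0
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    return {
        'lightness': l,                             # dark -> light
        'hue': h,                                   # position on the colour wheel
        'chromatic magnitude': s,                   # grey -> colourful
        'balance - blue-yellow': (r + g) / 2 - b,   # bluest -> most yellow
        'balance - green-red': r - g,               # greenest -> most red
    }
```

Sorting a set of files then just means using one of these keys: a pure grey has zero chromatic magnitude, a pure blue sits at one extreme of the blue-yellow balance, and so on.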
I made a new pixiv subscription and it failed to get any 18+ files, so I sent the logged in cookies via Hydrus Companion, tested that I could grab 18+ files by sending a tab via Hydrus Companion, and that worked, but deleting and remaking the subscription just gave me the same single SFW file, no 18+. What am I doing wrong?
>>17206 Figured it out. Had this issue before. Cookies don't stick right when sent from an incognito tab.
Some minor issues with the manage notes window that make it hard to use:
>Whitespace at the start or end of the note is removed. Whitespace is also removed from the start or end of each individual line. Also, if a note has multiple newlines in a row, even in the middle of the note, they will be turned into just one upon pressing apply. I want to use spaces to indent, and I want to be able to have multiple newlines in order to organize information nicely...
>If a file has multiple notes and you have the option "Start editing notes with the text cursor at the end of the document" unchecked, it only works for the initial note that was opened. Switching to another note tab will put you at the end of the document instead of the start.
>It would be nice if, when you press escape or cancel to leave the notes popup and you've made edits, it would warn you with an "unsaved changes will be lost. continue?" sort of thing.
Please take a look. Thanks.
(1.26 MB 3000x2500 cuter.png)

>>17205 >colour sorting Thanks, this is useful indeed.
Speaking of notes, hydev, could we get a mass edit option for notes? Or at least a mass delete?
>>17204 >1. Search files that for example must be in domain A AND also in domain B. > >For that you should change your domain to one with a broad scope, for example 'all my files' or better 'all local files' (Advanced Mode), which would also contain trash. Then do several searches with 'system:file service: > >system:is currently in my files >system:is currently in test location > >That way it searches for all files that are in 'my files' AND 'test location' at the same time. This is what I wanted, thank you. Trying to find files that are in BOTH file domains, so I can remove them from the domain they do not belong in.
>>17208
>Whitespace
This is a tricky one. The notes import workflow relies heavily on a bunch of merge tech, and it is really useful for me to trim and collapse excess whitespace so notes from different sources can be compared well. That said, all this merge tech worked out pretty shit tbh and I hate it, so it wouldn't kill me just to make the text trimming smarter and add some options so users can do what they want. I'll see what I can do!
>bad option
Thanks, I will fix it!
>warn on cancel
Great idea!

>>17213 This is doubly tricky, since the UI gets suddenly complicated and I need to do (even more!) merge tech to show how the files share notes and stuff, but I'll see if I can make it work like the multifile URLs editing dialog, where it has some red text or something saying 'hey, careful, you are editing for a bunch of files here, all will be affected.' Some way to say 'add this note to the files that don't have it yet' would also be nice.
Are there any plans to make it so you can click and drag to select multiple files?
I just installed Hydrus on MX Linux and the MPV viewer won't load. The library libmpv2 is installed and I even upgraded it with the version in the Test Repository, but I had no luck. Anyway, I'm just posting to inform devanon what and where MPV is failing, as I switched to running hydrus from source and it is working great, see pic 4.
Just checking in to see how much of a crack dream reverse PTR + IPFS is. With booru enshittification and the repeated attempts at making porn illegal, I feel something like it may become necessary.
>>17224 I'm no Linux expert, but I've seen that 'g_module_open_full' thing before--hit the 'Linux' tab here and scroll down a bit: https://hydrusnetwork.github.io/hydrus/getting_started_installing.html#installing Looks like you will be adding some .so files to your Linux install. I guess it is something to do with Ubuntu not agreeing with your Linux flavour about where some things should be. Failing that, running from source can help with these weird dll load problems, since you'll then be completely native.

>>17222 Yes, I would like this. I do it myself by accident sometimes, forgetting I still haven't added it. My thumbnail grid is awaiting an important overhaul, after which I expect it will be much easier to insert this sort of behaviour.

>>17225 IPFS uses a slightly tricky hash/identifier for files--iirc it is a bit like a torrent magnet, in which all the blocks of the content have separate hashes, and the hash of the master file that lists those blocks is the main 'id'--and, as I understand, you just cannot predict an IPFS hash with only an SHA256 hash. And I'm not up to date, but I understand you cannot query the IPFS network using a hash like an md5 or sha256. So, while it would be possible to do this, you'd also need a way to propagate (SHA256, IPFS multihash) pairs so our hydrus db would be able to cross-reference tags and stuff to an IPFS multihash that would get you the file (so you could, say, do a search for files you don't have but which have the tag 'samus aran' on the PTR and have a known IPFS multihash, and that list could spawn a new downloader that would work pretty much like a booru gallery search). I've thought about this, and I generally call it a PUR, for Public URL Repo--basically a repo that stores known locations/ids for SHA256 hashes--for which IPFS multihash could be one mapping type we could add.
Writing a whole new repo would be a ton of work, and I am still of the mind that managing files remains a trickier problem to solve than sharing them, but I agree we've seen some more restrictions recently. I'm mostly concerned with the big CDNs putting anti-downloader captchas on many sites. And I'm worried AI is going to break captcha and make Real ID the standard for anything with a web domain. We'll see how it goes, but yes, I think there could be a way forward here if we wanted to.

The Client API, also, is much more mature these days and is open to someone who has an interesting idea. I'm toying with the idea of client-to-client comms in the back of my mind, and I wouldn't be surprised if I slowly pivot over the coming years to a cleverer client, likely with AI tech to solve common problems like tagging character names and simple nouns, and retire or minimise the server. We'll see how everything shakes out.
I had an ok week. I ended up mostly working on some boring cleanup and refactoring, so there's not much exciting to talk about. There's new file viewing statistics stuff and some misc quality of life. The release should be as normal tomorrow.
Hello! I'm having an issue with a custom parser. In cases when it tries setting two notes with different names but the same content, only one of them gets associated to the entry. Test parse always shows them correctly and separately, and they both always get associated properly when they differ in content. Could it be a bug?
>>17228 I was talking with another guy, I think by email, about this exact same issue this week, but we resolved it, so I presume you aren't him.

My note merging logic isn't something I'm totally happy with, but it has a bunch of cases where it says stuff like 'if we are adding a name+text, and the name already exists, and the note that comes with it is a prefix of the new text, then replace the existing text', in order to stop it adding name(1)+text duplicates. A similar thing basically says 'if the text we want to add exists anywhere, then don't add it, since it'll just be a dupe'. This stops you getting the same gumpf for the 'danbooru artist comments', 'gelbooru artist comments', 'e621 artist comments' etc... It isn't super beautiful, but it saves some spam.

So, it isn't a bug. That said, I don't like it! And when the two duplicate note texts come in the same parse job, it pseudorandomly selects which will be first. I've got several ideas and heard several user opinions on how notes should change going forward, some radical, but I am not sure what I want. How would you, in an ideal world, like to handle the overall problem of duplicate note names and/or texts?
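The merge rules described above (skip a text that already exists under any name, replace when the existing note is a prefix of the new text, otherwise deduplicate the name) can be sketched like this. It is an illustration of the stated behaviour, not hydrus's actual code:

```python
# Hypothetical sketch of the note-merge rules. notes is a dict of name -> text.
def merge_note(notes, name, text):
    if text in notes.values():
        # The exact text already exists somewhere: adding it would be a dupe.
        return notes
    if name in notes and text.startswith(notes[name]):
        # Existing note under this name is a prefix of the new text: replace.
        notes[name] = text
        return notes
    if name in notes:
        # Name clash with different, non-prefix text: pick a deduplicated name.
        i = 1
        while f'{name} ({i})' in notes:
            i += 1
        name = f'{name} ({i})'
    notes[name] = text
    return notes
```

This is why two parsers setting the same text under 'danbooru artist commentary' and 'gelbooru artist commentary' produce only one note: the second call hits the duplicate-text rule and silently drops out.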
https://www.youtube.com/watch?v=lUVCJGSTcO8
windows zip: https://github.com/hydrusnetwork/hydrus/releases/download/v610/Hydrus.Network.610.-.Windows.-.Extract.only.zip
windows exe: https://github.com/hydrusnetwork/hydrus/releases/download/v610/Hydrus.Network.610.-.Windows.-.Installer.exe
macOS app: https://github.com/hydrusnetwork/hydrus/releases/download/v610/Hydrus.Network.610.-.macOS.-.App.zip
linux tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v610/Hydrus.Network.610.-.Linux.-.Executable.tar.zst

I had an ok week mostly doing boring cleanup. There's some quality of life improvements.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

highlights

The new 'sort by hue' file sort puts anything that has a very grey average colour at the bottom of the sort. The 'edit notes' dialog has some nicer focus logic, and it will warn you on cancel if you have made changes.

The menu settings under options->file viewing statistics are simplified. What was previously five different ways to show viewing stats in the file right-click menu is collapsed to two, and there's now a checkbox list of which view types you are interested in showing. The new 'client api' view type is integrated into the UI, and some views/viewtime system predicate parsing is easier to work with.

I overhauled the parsing pipeline this week, and I fixed some weird parsing bugs, bad test UI presentation, and so on. If you are an advanced downloader maker, let me know how things go.

next week

I got mired in cleanup this week and did not get to several things I wanted. I'll keep working my immediate small-job todo and try to clear things out a bit.
hydrus that allows hardlinks/symlinks instead of copying files when? i don't care if the database breaks sometimes damnit
I think there is a bug when I 'sort by tags: number of tags'. When I match the domain I am looking into in the search pane (the one under the OR button) to the 'tags' button domain (next to the collect/leave unmatched button)--let's say both are changed to 'downloader tags' and 'ascending'--then the sorting is correct. I suppose you have to match them, because that way the tags you see are exactly the tags you sort. The first files have no tags, then when I hold the right arrow key, I can see how the tag list gets more and more populated with tags and the scroll bar in the tag list gets smaller and smaller. That is how it should be.

But when I match the domain of the said buttons to anything else except 'downloader tags' (-> my tags, PTR, all known tags), the sorting doesn't work anymore and is all over the place. On average it might get more and more, but from file to file the number goes up and down.

Now I realized how to fix it though. If I change 'sort by tags: number of tags' to a namespace sorting, a second 'tags' button appears right above the first 'tags' button. This is actually the one that changes the behaviour in the 'sort by tags: number of tags' sorting (which is not visible there); that means 'downloader tags' was activated here, so it coincidentally worked in 'sort by tags: number of tags', which I described in the first paragraph. So we would need THIS 'tags' button there instead of the one which is there, which seems to do nothing in terms of sorting by number of tags. I think this 'tags' sort button is linked to the collection buttons left of it, so it changes the sorting of 'collect by' collections.

In short: how do I sort by number of tags for 'all known tags'? If it works for you like it does for me in paragraph 2, then try another domain. Does it still?
>>17183 (OP) >weekly releases How do you (or others) keep the flame of motivation alive?
I just realized I had closed my page with many watchers 3 days ago. - restoring a db backup would lose my changes - exporting all files downloaded or modified is very complex and would not necessarily save everything - "last session" backups happened every 5 minutes, so there were not enough of them - apparently sessions cannot be exported - I couldn't find where they are stored in the databases. Not sure I would have been able to use it anyway. - the nearest "exit session" was 6 days old. I chose this option.
>>17229 Thanks for the reply! >How would you, in an ideal world, like to handle the overall problem of duplicate note names and/or texts? Don't know about an ideal world but the way I expected it to work was to simply add a note with the specified content even if it's duplicate content. Since it's a custom parser I personally don't have to worry about spammy notes. I just want to have notes created with the content that they get from the data if it exists. It's not the end of the world since the data is technically preserved but it's a bit annoying that I don't know which of the fields that I'm scraping was actually populated. Maybe adding an option to note content parser settings to have that specific one not get deduplicated?
Oh, and another question! Are the databases between hydrus platforms interchangeable? Can I go from using the windows client, move the data to the docker image and then if need be return with the updated data back to the windows client without data loss?
>>17238 I switched from running the windows binary in Wine, to running the Linux binary, then running on Linux from source, all with the same database, so it should be fine
(1.97 MB 1146x534 python_M0S1QKv4Ay.mp4)

>>17231 Not any time soon, sorry! Maybe in the future, when the file storage system is more flexible and tolerant to error and can deal with multiple potential locations for files (i.e. handles less certain locations for files). Also, I don't have a Linux machine to test with atm, and Windows linking isn't nearly so capable, so I'm sure I would fuck it up. For now, I have to mandate KISS here.

>>17232 Thank you, well spotted and great report. Yeah, the basic problem is that the 'thumbnail grid' object isn't kept up to date with the current tag domain, so when I tell it to sort by tags in some way, we either have to specify a specific domain, like I added for namespaces, or it will default to 'all known tags'. I simply need to do some cleanup so it understands the current tag context, like the 'selection tags' list does, and make this work more naturally in normal situations. I could add the tag service widget to the 'number of tags' sort, but I'd rather not do that first, since it is a mild pain in the ass and it is probably pretty obscure to specifically want to search tags in one context but sort in another--most of the time, I imagine people want to sort in the same domain as the search.

>But when i match the domain of the said buttons to everything else except 'downloader tags' (-> my tags, PTR, all known tags), the sorting doesn't work anymore
My bet here, btw, is simply that your 'downloader tags' service tends to have more tags than anything else, so it lines up with 'all known tags' more often. In reference to your broader tests: if I set the file sort to 'num tags, descending', but then change the tag domain, as in vid related, the sort stays the same. I'm pretty confident it is just sorting by 'all known tags' the whole time. I've reminded myself to do this cleanup/overhaul.
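The cleanup described above--making the 'number of tags' sort obey the current tag domain--boils down to picking the tag-count lookup for whichever service the page's tag context points at, rather than a hard-coded default. A toy sketch with invented service names and counts, not hydrus's real data model:

```python
# Toy model: per-service tag counts per file. Sorting by 'number of tags'
# should use the counts for the page's current tag domain.
tags_by_service = {
    'downloader tags': {'file_a': 3, 'file_b': 1},
    'all known tags': {'file_a': 3, 'file_b': 5},
}

def sort_by_num_tags(files, tag_domain, ascending=True):
    counts = tags_by_service[tag_domain]
    return sorted(files, key=lambda f: counts.get(f, 0), reverse=not ascending)
```

Note how the same two files sort differently under the two domains, which is exactly the confusing behaviour the bug report describes when the buttons don't match.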
>>17234 I went through a handful of crash-out drama bombs in my youth, and now I have a better handle on what causes burnout for me. Mostly it is working in teams; I simply cannot do it without sperging out and working too hard trying to keep everyone happy and then resenting the whole thing and quitting in a big to-do. So, I now work strictly alone. There's a handful of guys I trust to do pull requests on simple extensions or whatever, but I never 'collaborate' on code in any meaningful way.

As for keeping on doing it every week, I am someone who always did his homework the night before. If I let myself schedule something for two months from now, that means six weeks of doing nothing. By having a tight loop and a hard regular deadline, I'm able to keep myself productive and I can also fix my mistakes quickly. I'm not professional and I throw out garbage code regularly. I don't know how to turn myself into a conscientious person, so instead I just sling a ton of mud at the wall and fix it when I screw up.

For the actual motivation, I don't know. I enjoy helping others (I am beholden to this, in fact), and the anonymous imageboard community is my home, so that part all seems to line up. I'm very stubborn, and I avoided many opportunities to quit simply by keeping my head down and pushing on. I can't stomach becoming a normalfriend, and I am also supremely lucky to be in a personal situation where I am able to do this, so I'm free to dump an indefensible amount of time into this weird project, hahaha. If you are 22 and you find it difficult to stick to a difficult project, don't worry. All the emotional stuff just gets easier as you get older.

>>17235 Well done. I have done this myself, same deal, just by accident, and then noticing days later. I should add a yes/no confirmation any time you close a downloader with stuff in it.

>>17237 Thanks. I think you are right--more options in 'note import options' is going to be better than half-hidden logic.
>>17238 >>17239 Just to confirm, yeah, 100% portable. The db directory is the same no matter your platform. Only thing that can fudge an 'installation migration' is if there are hard-saved absolute paths for your import/export folders, or custom file storage locations, that point to ~/blah or G:\blah, which the other OS may have trouble with. But it won't break anything permanently--hydrus will generally just moan about it and let you fix it to the corrected path.
(71.21 KB 500x293 autism meter.gif)

>>17241 >For the actual motivation, I don't know. I do know.
Is there a way to let files slip through the tag blacklist on a case-by-case basis? Kinda like how you can click 'try again' in the file log of the url downloader page and redownload previously deleted files. Basically, there's an image I want that has a tag I've blacklisted, but I don't want to mess with the list--I just want the one particular image.
>>17241 Perhaps I should try and set myself some deadlines. A couple of years ago I was pretty regularly making progress on my projects, but I had to abandon them when the job search was getting tough. Now I'm hoping to pick back up a project I had probably put about 4-6 months into before. Part of my problem is that I'm usually feeling quite tired during the week, another part is that it's been kind of hard to come home from 8 hours of work on a computer and start more computer work. But I've been thinking about the project more lately. Been hoping to be struck with "motivation" or "inspiration" to get back in the habit of working towards my goal. The big problem is getting started again I suppose. Anyway, thanks for the response. Always good to hear from those who work on cool projects.
>>17240 >I imagine people want to sort in the same domain as the search Yeah, basically i THINK it would be better when i change the domain in the main tag domain button (let's call it like that) and then want to sort for number of tags, the thumbnails sorting should change according to the tag domain chosen in the main tag domain button, but right now it changes according to the upper 'tags' button that appears when changing the sort to namespaces. Without that button, the num tags sorting is broken (since the thumbnails dont change like in your video), but as you said, i would also rather not have that button because it makes things more complicated imho. I guess that's exactly what you said :D >>17240 >In reference to your broader tests, if I set the file sort to 'num tags, descending', but then change the tag domain, as in vid related, the sort stays the same. I'm pretty confident it is just sorting by 'all known tags' the whole time. Not sure if i get what you are trying to say since it is complicated and hard to wrap my head around all the sorting systems of hydrus (there are basically 3 tag domain buttons when sorting by namespaces!), maybe you are correct, but i dont think it is always sorted by 'all known tags', since changing the domain in the upper 'tags' button that appears when sorting by namespaces, AND THEN going back to num tags sorting, the thumbnails changed, but from there the main tag domain button doesn't change the thumbnails anymore like in your video. So when i change the upper namespace 'tags' button to 'my tags', then going back to num tags sort and change the main tag domain button also to 'my tags', the tag list shows the correct 'from 0 tags to most tags' behaviour. That is also the case for every other domain but the upper 'tags' button and the main tag domain button have to match. 
That's why I think it is not always 'all known tags', but most probably I just don't get it :D Sorry for the wall of text; you probably figured it out already anyway, I just want to help as much as I can so you might find a fix faster.
>>17245 Thanks--you are completely correct, and I didn't read+test properly. I did this cleanup work yesterday evening, and it looks like I wasn't doing 'all known tags' but some internal value that happened to default to that and was overwritten in the case of the namespace stuff. I've got the thumb grid updating to the current tag domain now, and the 'number of tags' file sort obeys that. Let me know if it gives you any more trouble! >>17243 I'm pulling back from 'exception' workflows since I'm on a big KISS kick, so I don't think I'll go in this direction. I've got many longer term plans to make selecting your 'import options' much easier, with favourites/templates you can load up real quick, which I think is the better answer here. If you could select a tag import options that didn't have the blacklist and then try again in like three clicks, that's nice and natural and doesn't shake up the workflow with complications. Unfortunately that will take much longer to get to, so it doesn't fix your problem. Sorry! If it is just the one file, I think the answer is to make a new urls downloader and set some specific tag import options, clearing the blacklist. Now that I think of it, I can and should have an easy way to reset a blacklist you are looking at, so I'll see about that. >>17244 Good luck! I read sometimes about people who coded up some clever new game engine 'in their spare time' after coming home from work, and I don't know how they do it. If I had to do a nine-to-five every day, I simply wouldn't have the energy for hydrus. I am very, very thankful that I don't have to. Maybe a deadline will help your situation, but if you force it, I could see it leading to its own sort of burnout. Maybe you can reserve Sunday afternoons or something, and then at least something is being done, but you aren't slaughtered. I'm crazy about todo lists, and if I have a project summary like 'add video to the hydrus duplicates system', it is never getting done.
I always have to break it down into eight sub-jobs and then those into scratchpads with twenty thoughts. I feel like the back of your brain knows the rough size of problems, and if it is too much you just discard it for something more immediately satisfying. But 'convert this vague project idea into smaller chunks' is itself doable in half an hour, and then you've got something done, and next time you work you have a small list that is again easy to get into and have some measurable progress on. In contrast to what I said before, I feel like when I was twenty, I could code until 3am every night hahaha. Just endless energy and enthusiasm, but no emotional control. Although, all my code back then was total shit, so I guess it is not an honest memory.
I have a bit of a problem, and a feature request that should solve it. When you have a subscription or a gallery downloader that has post pages, and those post pages can contain files, there's an issue where, if a file exists in multiple posts, the downloader will only check the file for the first post it sees, and basically ignore it for every post afterwards in the downloader. I get why this is done, but a big problem with it is that if a later post had metadata (like tags and urls) that needed to be added to the file, the downloader won't add it, because it didn't check the file again due to it already being in the file log. If it's not too much trouble, could you add an option for downloaders to add the file as a child of a post, even if that file already appeared in the downloader under another post, so that it will grab the metadata from every post that the file is in, regardless of which order the downloader sees them in? It'd make managing my files a lot less confusing, because currently there are a lot of holes in the title tags and post urls of my files from certain artists due to this issue. I hope this makes sense, and thanks!
>>17248 It sort of depends--can you tell me a little more about these duplicate/subsequent URLs? Is your situation: A) The same file is appearing multiple times at the same URL? B) The same file is appearing multiple times at different URLs? If A, like you have a sub that is failing to re-queue the same URL twice, or a downloader you are running twice that refuses to queue up the same thing it saw once again, then check out the file right-click menu->urls->force metadata refetch->blah urls. This is a common problem, wanting to catch up to new metadata that has been added to URLs that were first downloaded a while ago. However, that doesn't sound like what you are describing. So, if B, can you tell me broadly about the 'shape' of the downloader you have here? Are we walking through a gallery search that is producing a list of Post URLs, or is this some complicated nested thing? Is this something like Kemono, where you are parsing all sorts of files from a richly formatted post, so the same file can turn up over and over? It isn't Pixiv, is it--I mean, Pixiv can have multiple files per post, but you don't get the same file in different places, do you? If the same file appears in multiple posts, hydrus would normally grab the metadata from the different URLs, even though it recognises the file has been seen before. It typically only recognises that it knows a file once it has hit the (new) Post URL, by which point it has parsed tags and stuff from the rest of the document, and it'll add it anyway even though we are 'already in db'. The only circumstance I can think of where it won't is if you are somehow parsing hashes or maybe source URLs at a preliminary stage, probably the gallery parsing stage with a subsidiary page parser, which is causing hydrus to skip hitting the Post URLs to get additional metadata. Maybe you can tell me a bit about what you see in the 'notes' column for these 'already in db' results in the 'file log' for the downloader.
Does it say 'url', 'hash', or 'file' recognised? OR am I misunderstanding, and this is a problem within the parsing step itself? Are you parsing separate posts within a single document, and somehow I'm deduplicating URLs somewhere? Or do you have Post URLs inheriting metadata from their parents, hence why you want child creation?
I had a good week. I fixed several annoying bugs and added some quality of life. The release should be as normal tomorrow.
>>17257 >So, if B, can you tell me broadly about the 'shape' of the downloader you have here? >Is this something like Kemono Yes! It's kemono.su that I'm having this issue with, for downloaders and subscriptions. So it looks like this. There's a gallery that has posts, and the posts contain multiple files. If the same file appears in multiple posts, it will have the same basic url (from what I can see, they sometimes or maybe always have a different "f" query parameter, but I think Hydrus strips that anyway, and I'm not sure if it's always there). >If the same file appears in multiple posts, hydrus would normally grab the metadata from the different URLs, even though it recognises the file has been seen before. It doesn't seem to do this if it already has that exact file in the current log. It'll only grab the urls that aren't already in the log. I just checked with a post right now, and it said "Found 1 new URLs" but there are actually 2. It's just that the second url is already in the downloader's file log, so it seemingly just skips adding it as a child of that post, and thus skips adding the tags and urls and such from that post to the file. >It typically only recognises that it knows a file once it has hit the (new) Post URL, by which point it has parsed tags and stuff from the rest of the document, and it'll add it anyway even though we are 'already in db'. If I'm understanding you properly, the issue is that it's not adding the file url again and saying "already in db". It's instead just not adding the file url again AT ALL, because (I assume this is why) it's already in the file log, so it just silently skips it. That's the issue that's causing the metadata from only one of the posts the file's in to actually be added, I think. >Maybe you can tell me a bit about what you see in the 'notes' column for these 'already in db' results in the 'file log' for the downloader. Does it say 'url', 'hash', or 'file' recognised?
There is no notes column for the file url, because it's not getting added the second time at all. It only gets added for the first post where the downloader sees that file url; afterwards it just silently doesn't add it again as a child of any future post urls for that downloader. >OR am I misunderstanding, and this is a problem within the parsing step itself? Are you parsing separate posts within a single document, and somehow I'm deduplicating URLs somewhere? Yes, I think this is exactly the problem! Hydrus is silently deduplicating file urls in the log, which is causing it to not properly get the metadata of the file at that url for each post (and post url) that contains that file. Instead, it's only getting the metadata for the first post url it sees that has that file, and then afterwards it just pretends that any future posts with that file url... don't have it. Like it's not there, so nothing gets added, and the metadata for those files is incomplete. >Or do you have Post URLs inheriting metadata from their parents, hence why you want child creation? I have file-urls inheriting metadata from their post-url parents. But I need the file url to be added to the file log by the downloader for each post-url that it appears in, so that hydrus can see and add the metadata from each post that the file appears in to that file. Even if that exact url has already appeared in the log, it's going to have different metadata attached to it depending on which post url it appears under. I know I was kinda repetitive here, but I just wanna make sure you understand what's happening properly. Thanks a lot for helping! This is a really annoying issue for me and I'd love to see it fixed!
https://www.youtube.com/watch?v=QYWR-jF7Mek
windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v611/Hydrus.Network.611.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v611/Hydrus.Network.611.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v611/Hydrus.Network.611.-.macOS.-.App.zip
linux
tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v611/Hydrus.Network.611.-.Linux.-.Executable.tar.zst

I had a good week. There's a bunch of fixes and quality of life improvements.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

fixes

I think I fixed the long-time issue where a search page that was scrolled some way down would often reset to the start position while you were on another page!

Another long-time bug, a perplexing one, where the media viewer could, on rare files, suddenly not update on content changes--for instance, you'd click the rating, but nothing appeared to happen--is fixed. The update was being fired off correctly (you could watch the original thumbnail get the rating or archive or whatever), but the media viewer was just briefly confused about how to update itself. It basically happened when a file maintenance job ran on a file in a media viewer list after the media viewer had booted. Just an unlucky thing. Should be all fixed now, but let me know if you run into any more trouble.

I fixed a couple of Drag and Drops--from thumbnails and page tabs--where if you released your click at precisely the wrong time, the DnD would stay attached to your mouse despite not having any mouse buttons down. The program now recognises this and cancels the drag.

The 'apply image ICC Profile colour adjustments' option under the media viewer's 'eye' menu is more reliable. It wasn't clearing the caches properly before, and so the old image was sometimes lingering unless you zoomed, etc. It seems to immediately update now, every time.
new stuff

If you have a search with a system:limit, changing the file sort will now refresh the query. When you have system:limit, most sorts are able to fetch the nth 'tallest'/'smallest'/'reddest' files, so changing the sort will now resample from the larger pool of files at the source. You can turn this off under options->file search. Also, the file sort widget now says in a tooltip whether it is able to do this database-level system:limit clipping (the complex sorts can't).

The thumbnail grid now sorts files by 'number of tags' according to the current tag domain--set it to 'my tags', and it'll be different from 'all known tags', etc.

If you try to close any non-empty downloader page, the program now throws up a yes/no confirmation dialog. You can turn this off under options->gui pages, where you can now also, if you prefer, set it to confirm all page closes.

In an experiment, you can now map the modifier keys on their own. They'll show as 'ctrl+nothing' or 'alt+shift+nothing'. They seem to work, but I can't promise they'll be fine everywhere.

For parser makers, I added a STATIC formula, which simply spits out the same text every time, regardless of the actual parsing text. Might be useful for fixed, parser-based additional tags and so on.

next week

Back to duplicates auto-resolution.
>>17264 >fixed the long-time issue where a search page that was scrolled some way down would often reset to the start position while you were on another page! Funny, I was actually using that to quickly see if my (very full) subscriptions downloaded new files; https://xkcd.com/1172/ finally happened to me. Thanks for the work as always!
Hey hydev, I've been having some issues with my Linux machine and I think I found what is probably a bug in Hydrus. When my issues crop up, Xorg dies, and hydrus doesn't recognize the X session has gone away and keeps running; I have to sigkill hydrus if I want to reopen it. I haven't been reopening hydrus much recently since this issue is pervasive and I'm paranoid about my db getting messed up, so I don't have much in the way of tests. So I'm not sure if I'm just impatient or hydrus can't properly shut down, but a nice shutdown with sigterm doesn't seem to work. Hopefully it is me being impatient, but hydrus itself doesn't shut down on its own, that I know for sure. No clue how it works, but if possible could you please add a "nicely shut down if the X server dies" feature?
Do you recommend autoupdating hydrus with something like scoop? Or just keep using one version as long as it works?
>>17264 >I think I fixed the long-time issue where a search page that was scrolled some way down would often reset to the start position while you were on another page! Thank you so much!!!
>>17264 >The thumbnail grid now sorts files by 'number of tags' according to the current tag domain--set it to 'my tags', it'll be different from 'all known tags', etc.. >>17246 > Let me know if it gives you any more trouble! Awesome! Seems to work without issues! Big step in the right direction. That being said, I played around a bit and found that collections maybe could also have a rethink and some similar change. Not sure if it's possible or if people need the current behavior, so tell me if my suggestions make sense: Change sorting to 'sort by collections: number of files in collections' for easier understanding of my problem, then 'collect by creator' for example. Also search for 'creator:*anything*' (because I don't know if the change could apply to 'system:everything/inbox/archive' searches). Now if you change the main tag domain button to something other than 'all known tags', the number of collections in the thumbnail grid seems weird. For example, let's say I have 203 files with only the 'creator:artistA' tag in 'my tags'. I would expect the grid to show 1 collection with 203 files, but it shows me 4 collections: a 200-file collection + three 1-file collections, even though the creator tag is the same for all of them (only creator:artistA). That happens because the three lone 1-file collections have other creator tags in other tag domains like the PTR, therefore also in 'all known tags', which the grid seems to rely on. Looking into 'all known tags', each of the 1-file collections has artistA in combination with another creator tag that the others don't have. That's why each combination stands on its own, meaning if a file has two or more different creator tags, it creates its own collection. So what I would like: in 'my tags' the thumbnail grid would show only one collection with 203 files, because they all have only the 'creator:artistA' tag. Same collection behavior for all other domains that are not 'all known tags'.
In 'all known tags' the grid behaves correctly at the moment imo and shows a collection for each creator combination, including ones that aren't in 'my tags' at all. I think for 'system:everything/inbox/archive' and some other system predicate searches that wouldn't be possible, since those show the same amount of files all the time, no matter what tag domain you choose. Here the number of collections and the number of files within the collections are always the same, but probably that's by design (because they don't rely on the tag list when sorting? Just an assumption by a layperson). With 'system:number of tags > 1' the grid behaves like with a 'creator:*anything*' tag search again, though.
>>17272 >means if a file has two or more different creator tags, it creates one collection Correction: it can of course be one different creator tag too. The important thing is that each combination of creator tag/s creates one collection, which is easiest seen in 'all known tags'.
>>17264 >I fixed a couple of Drag and Drops--from thumbnails and page tabs--where if you released your click at precisely the wrong time, the DnD would stay attached to your mouse despite not having any mouse buttons down. The program now recognizes this and cancels the drag. I never encountered this issue, but I'm now getting false cancels when I try to drag and drop stuff, resulting in me having to do it several times sometimes. Can I toggle this back?
IT KEEPS HAPPENING
>>17257 >>17262 >within a single document To clarify, I think I might've misunderstood you here. If by "single-document" you mean "within a single parsing step" then no, it's not multiple posts' file urls being added simultaneously. These are separate post urls that have some file urls that are identical to each other. The post urls are added by the gallery search, then the file urls are added by entering each post url.
I don't know what I keep pressing that does this, but how do I un-highlight the query box? Whenever this happens, I can't enter a query with the enter key anymore and have to click search results from the list instead. I've been fixing this by duplicating the current page and removing the old one.
(3.81 KB 223x103 Untitled-1.png)

Something's off in the logic here. Same number of frames, same fps.
(15.48 KB 332x659 Screenshot (209).png)

Just updated from 606 to 611. One of my gallery downloads has been giving me really odd statistics. This was a new download, none of these are "already in my database" nor "deleted", and it's only displaying 392 images I just downloaded (I did delete a few, so it's probably a little over 400). I even did a search on the artist and only 392 images show up. Even stranger, I did a couple of gallery downloads before this one after updating, and they all showed up fine, with no issues or claims that they were already in my database/deleted.
>>17262 >>17277 Thanks, I think I got you. If an import object--which is a single row in the file log--parses multiple URLs to pursue, it creates child objects, so you'll get:
POST URL 1
POST URL 2
POST URL 3
converting to:
POST URL 1 (successful, created three children)
- FILE URL 1
- FILE URL 2
- FILE URL 3
POST URL 2
POST URL 3
And the problem is that when we hit a later Post URL, it wants to create a child for FILE URL 2 or something, and the file log says 'already hit that, bro', and skips adding it as a valid new child. I'm sorry to say, now that I think about this, that I can't see a clear and nice solution. My file logs are very strict about not duplicating data (which is mostly a backstop to avoid loops), and the fundamental data structure of a file log relies on each row having a unique key (URL), so I can't really write a hack in here to allow the dupe. I also don't have the ability to go 'we already did that here, but let's merge in our new metadata and re-run the write content updates job'. The add event is simply skipped, just as you say. I could think about a long-term solution where I say 'don't process children until the parent layer is done, and then figure out content update merge instead of skipping', but there are good reasons to process children immediately for sites with timeout tokens. So, this is a dumb and not excellent answer, but I think you'll have to hack it on your end for now. Mainly you want to trick the file log into not realising they are the same URL. Change the parser so it puts a meaningless extra bit of data on the File URLs you parse.
This could be a parameter, let's say based on parent post id, that the server won't error out at receiving. If you are getting something like:
https://n1.kemono.su/data/f7/ab/f7ab95bec4a7eaac9b5e5c6e80c5b9fd38443a7ac75b0e593e7c3c9ab0290152.jpg?f=nRGFiR1UNGX7St2H7uSbJQ33.jpeg
and convert that to:
https://n1.kemono.su/data/f7/ab/f7ab95bec4a7eaac9b5e5c6e80c5b9fd38443a7ac75b0e593e7c3c9ab0290152.jpg?f=nRGFiR1UNGX7St2H7uSbJQ33.jpeg&whatever=9465655
then hydrus will think it is a different URL. EDIT--But wait, now I see a real URL here, I see the 'f' you mentioned, and that it is already being stripped, and that could be our key here. No need to add a spurious new param if we can figure out why the always-unique 'f' is disappearing. So, go to 'manage url classes' and see if that file url matches anything. Maybe it is matching a kemono Post URL by mistake. If so, make a new URL Class just for the File URL (or perhaps edit an existing one) and add the 'f' parameter so it knows not to remove it when normalising. Unmatched URLs do not have 'excess' parameters removed on normalising, so I'm fairly confident your File URLs here are matching something. Once you can get the File URLs normalising with a bit of tokenised unique data, the file log should allow them. Let me know how you get on!
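To illustrate the normalisation side of that idea (a hypothetical sketch of a keep-list, not hydrus's actual URL Class code): if every query parameter not on the keep-list gets stripped during normalisation, then whether the two per-post file URLs collapse into one file log row depends entirely on whether 'f' is on that list.

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

def normalise(url, keep_params):
    # Drop any query parameter not in keep_params -- a rough sketch of
    # how a URL Class keep-list might behave, nothing more.
    parts = urlparse(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k in keep_params]
    return urlunparse(parts._replace(query=urlencode(kept)))

# hypothetical example URLs in the same shape as the kemono ones
a = 'https://n1.example.su/data/f7ab.jpg?f=tokenA'
b = 'https://n1.example.su/data/f7ab.jpg?f=tokenB'

# 'f' stripped: both posts' file urls normalise to the same row key
same = normalise(a, set()) == normalise(b, set())
# 'f' kept: the urls stay unique and the file log can hold both
distinct = normalise(a, {'f'}) != normalise(b, {'f'})
```

With an empty keep-list, both URLs normalise to `https://n1.example.su/data/f7ab.jpg`, which is exactly the silent dedupe described above; keeping 'f' is what preserves one row per post.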
>>17268 Thank you for this report, and I am sorry for the trouble. Can you try the two sigterm tests under help->debug->tests, do not touch? One of those is real, and one just calls the thing a sigterm is supposed to call. Do either work for you under normal conditions? Trying to wrestle this thing into receiving system and session shutdown calls is hellish sometimes; I regret it isn't cleaner. I try to just say 'hey, when the system tells me it is shutting down, I obey', but it seems things are not always so simple, and I think my legacy code is still firing in the wrong thread-order, despite many iterations of cleaning the shutdown code. I'm currently planning on getting full 'this program is stopping shutdown' logoff stuff working on the Windows side of things, since I know I'm not setting the right flags for it and I need to do more research, so if I can get that working correctly, maybe it'll help on yours too. If you boot hydrus from a terminal and hit ctrl+c on the terminal window, does that work? It should fire a quick shutdown that skips some stuff, including no shutdown splash window. I think that works as a certain sort of SIGINT, but I'm not super confident on the Linux side of things. >>17269 I know some guys have scripts that wangle auto-update with chocolatey or even winget. You can do whatever you like, with the sole proviso that any script you make must integrate a backup sync beforehand. As long as you make a backup, then it doesn't matter what fucked up thing could happen in the update step, including where I post like 'hey, every two years or so you have to do a clean install, and we are here again, so here are the special install instructions'--you haven't lost or damaged anything. Tbh if you want clean auto-update, I'd say run from source and then just do a git pull once a week, again after backup.
It takes like three seconds to run, and then you might want to rebuild your venv once per six months or so, which I make easy with the setup_venv script. https://hydrusnetwork.github.io/hydrus/running_from_source.html >>17272 Thanks. I agree. Now I've got the thumbnail grid aware of the current tag context, I need to deploy it to pretty much all tag computation on that side of things. I also want the thumbnail banners to filter to the current context. I hadn't thought of collection slicing, but that's another one to do. I'll see if I can do some this week, let me know how it goes. >>17274 Thanks, can you explain more what goes wrong? The current process on my side is: If the user has been moving the mouse with click down from the drag source for more than 100ms, start the drag and drop. If, 100ms after that drag starts, the mouse is now up but the drag is ongoing, cancel the drag. This second part is what I added last week, because there was a marginal moment you could release the click right at the 100ms moment and things would get confused and the drag would start even though the button was already up. So, are you just doing a normal drag of a tab/file and ~100-200ms after the start it suddenly cancels? Despite you having the mouse button down? Do you have anything funny going on with your mouse, like you have primary/secondary clicks swapped or anything? Maybe I am testing 'Left down' the wrong way in your case.
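The two-timer drag check described above could be sketched like this (simplified pseudologic with hypothetical names and millisecond timestamps, not the actual Qt event code):

```python
DRAG_DELAY_MS = 100  # the ~100ms threshold described above

class DragGuard:
    """Sketch: start a drag only after ~100ms of moving with the button
    down, then cancel it if the button turns out to be up ~100ms after
    the drag began (the stuck-DnD fix from this release)."""

    def __init__(self):
        self.drag_started_at = None

    def update(self, now_ms, button_down, moving, press_at_ms):
        if self.drag_started_at is None:
            if button_down and moving and now_ms - press_at_ms > DRAG_DELAY_MS:
                self.drag_started_at = now_ms
                return 'start drag'
        elif not button_down and now_ms - self.drag_started_at >= DRAG_DELAY_MS:
            # a drag with no button held is stuck -- cancel it
            self.drag_started_at = None
            return 'cancel drag'
        return None

guard = DragGuard()
events = [guard.update(150, True, True, 0),    # held + moved past 100ms
          guard.update(260, False, False, 0)]  # button somehow up: cancel
```

The reported false cancels would fit a 'button_down' read that wrongly returns False once the cursor leaves the client area, which this kind of check would then treat as a stuck drag.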
>>17275 Shame all around. I don't know enough about the Arizona political situation to guess what will happen next. It looks like the governor vetoed the same basic bill a year ago, so if the legislature are trying again, are they just posturing or did they change the language to satisfy her? My engineering brain says that you make kids safe by applying strict whitelists to kid electronics. A blacklist whack-a-mole on adult electronics is never going to work and only going to fuck with privacy. But when these bills pass and the proponents cheer how the sites now have reduced traffic across the board, you realise what it was really about. I know, I know, gasp, shock, and horror that politicians might be duplicitous. If it passes, I guess traffic will collapse, and on our side of things the hydrus downloader will break. Maybe Bad Dragon can spin e621 out to its own entity and move it to a more friendly state, but I always figured the site was run at a loss and it is really advertising for the core product, so it probably isn't feasible. All the more evidence that offline storage and management is useful. You simply cannot trust the cloud to be there in five years. That said, if e621 survived because of parent subsidy, and it had an outsized market share because of that, then its absence will presumably allow for a more healthy ecosystem of multiple normal boorus operating in different places and with their own tag taxonomies and so on. Might be a smaller space overall for appreciators of obese farting anthropomorphic lion drawings, but a less brittle one. >>17282 Probably 'insert' key. It enables some IME stuff for inputting kana and kanji etc... You can remove this under shortcuts and the 'tag autocomplete' set. >>17285 Yeah, that's odd. Normally the janitors have been removing all the 'clothing' stuff. I know 'thigh-highs' in particular has had a bad loop in it, so maybe they are fixing something. 
I don't see anything in the janitor chat, so give it a day or so and see if they did something already, otherwise propose a sibling change putting 'clothing:thigh-highs' to, say, 'thigh-highs', and I should think it'll be accepted. I'll make a job to check this in a few days. Might be something got fucked somewhere because of the history of this complex tag and I need to do something. >>17287 Thanks. I bet the duration difference is tiny, just a few ms. I'll have a look at the thresholds. >>17288 Thank you for this report. If this is very private, you don't have to post the log itself, but can you tell me more about the 'shape' of the information here in the file log? Do you have anything like this:
URL | successful | (found 2 urls)
Sub-URL | successful | (already in db)
Sub-URL | successful | (already in db)
URL | successful | (found 1 urls)
Sub-URL | successful | (already in db)
URL | successful | (found 2 urls)
Anything like that, where some of your extra rows might be some 'container' URL that is spawning children, so you have multiple rows per file? Or is it a normal booru-style log where it is just a flat list of URLs that were executed on, one file per object? Or, do any of the URLs have weird token parameters, like 'http-url?tokAuth=eou4o954u9o5e4u95oe46uo'? Sometimes you can get the same URL twice because of some weird parse, but hydrus thinks it is different because of some unique value like this. Sometimes you get an 'already in db' result for something that was in the queue earlier, which sounds like it might fit your situation here. If those lines say 'already in db - was seen 6 seconds earlier', that's typical for this. And, what in your 'file import options' of that page are the 'presentation options'? Does it say like 'all files | must be in inbox | all my files'? Sometimes the presentation options cut down what you actually see from the master list by excluding previously archived stuff.
If you click the little down arrow on the file log button and hit 'show all files in a new page', is it still 392?
>>17290 >Thanks, can you explain more what goes wrong? Testing it some more, it only happens when attempting to drag things outside of the Hydrus window. I will still see a little plus sign attached to my cursor for a split second even after the mouse is outside the Hydrus window, then it vanishes. I'm not releasing the click, and my mouse works fine otherwise, except for some occasional issue where it right-clicks on its own that I have yet to pin down, probably because it's a cheap mouse, but that's rare and this is consistent and repeatable. It happens about half or more of the time right now.
>>17285 This is why I manually tag damn near everything, don't use the PTR, and don't use siblings. This mess seriously fucks with my autism.
>>17292 Bro what the fuck is going on, I just tested what you describe and it is happening super reliably if I drag and drop out of the main window before the second 100ms timer hits. Somehow the 'is left down' state is not testing correctly when the cursor is out of the client area. I will check this, sorry for the trouble! Slow initial drags seem to work around this for now.
(992.80 KB 250x250 sensible chuckle.gif)

>>17294 >Bro what the fuck is going on It amuses me to see you use this sort of language given how you normally post.
>>17295 A raw look at what UI code does to me. An endless series of 'why the fuck is it doing that?'
(11.95 KB 239x545 Screenshot (210).png)

>>17291 >Thank you for this report. If this is very private, you don't have to post the log itself, but can you tell me more about the 'shape' of the information here in the file log? Do you have anything like this: Never mind, I think I found out why. It's another Pixiv nonsense thing again. When it came to those image set collections Pixiv does, this specific artist would add samples + commission prices inside his collections. Say an image set with 20-25 images in the collection: more than 2/3 would be samples + commission prices. This artist basically reused those samples + commission prices in every collection. I deleted the first batch of samples + commission price images thinking it was a one-time thing, and hydrus kept picking those up, adding to the file count. Different URLs, same hash. Some of which were downright identical. tl;dr Nothing wrong with Hydrus, just me being dumb and not paying attention.
>>17290 Say if an update messes something up it should only be the .db files that need to be restored/backed up right? The same library files (the pictures) should still work I assume?
>>17300 yeah, the db files are where the update happens. It should be safe to back up just those files right before an update, then simply restore them if it goes wrong
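As a sketch of the "back up just the db files before updating" idea: the four SQLite files hydrus updates live in its db directory (client.db, client.caches.db, client.mappings.db, client.master.db). A minimal helper, to be run while the client is closed, might look like this; the paths in the example are placeholders for your own install.

```python
import shutil
from pathlib import Path

# The four SQLite files hydrus updates live in its "db" directory.
DB_FILES = ['client.db', 'client.caches.db', 'client.mappings.db', 'client.master.db']

def backup_db(db_dir, backup_dir):
    """Copy just the client db files. Run this while the client is closed,
    or the copies may be inconsistent."""
    db_dir, backup_dir = Path(db_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in DB_FILES:
        src = db_dir / name
        if src.exists():
            shutil.copy2(src, backup_dir / name)  # copy2 preserves mtimes
            copied.append(name)
    return copied

# Example (paths are placeholders):
# backup_db('/home/anon/hydrus/db', '/home/anon/hydrus_db_backup')
```

For a full backup including media, the built-in backup or a sync tool (see the FreeFileSync advice later in the thread) is the better option; this only snapshots the database itself.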
not really important but I just started using hydrus companion again after trying it once loooong ago and dropping it since it slowed my browser to a crawl for whatever reason. using it now not only speeds up my workflow, I just noticed the URL database checking thing and that's super fucking helpful, damn I should not have slept on retrying companion
it's funny, cause I did the same thing with hydrus itself back in like '17 or '18, tried it a bit and hated it, only to come back like 2 or 3 years later and see how amazing the program is
I've been testing more tools for a while now, but may I ask if there are plans in the future to run scripts automatically on import? Right now, whenever my subscription downloads archive files, I use hyextract to manually extract and reimport the data and then delete the original files, which is a real pain the more times I do it. This would be useful as well with AI tagging such as WD. Any thoughts on these?
>>17290 >Can you try the two sigterm tests Sure: >self sigterm Quickly closed gui, took about 20 seconds to clean up >self sigterm fake It also quickly closed the gui, also took about 20 seconds to clean up. >terminal C-c Prompted if I was sure if I wanted to shut down, I let my thread downloads finish before retrying. It also prompted for maintenance work, which I skipped. Took about 5 seconds to do some final cleanup once the "saving gui and quitting" job finished.
I had an ok week. I mostly did behind the scenes work, so the release just has some bug fixes. The release should be as normal tomorrow.
Has "parse advanced predicate string" ever worked correctly?
(A && B) || (C && D)
a OR c
a OR d
b OR c
b OR d
A xor B
a OR b
-a OR -b
>>17311 I have a duplicate filter task that maybe I should not be doing, because of the volume, but here it is. I have thousands of video frames, and I want to find the best ones. Each frame has a "frame:number" tag. I want to compare pairs of adjacent frames and use "this is better and delete the other", "same quality", or "related alternates" on them. That is, I want the duplicate filter to compare only 1 and 2, 2 and 3, 3 and 4, 4 and 5, 5 and 6 etc.
>>17316 those are both correct, what's the issue?
>>17316 Huh, guess they are.
a OR c
a OR d
b OR c
b OR d
(a && a && b && b) = (a && b)
(a && a && b && d) = (a && b && d)
(a && a && c && b) = (a && c && b)
(a && d && b && b) = (a && b && d)
(a && d && c && b) = (a && b && c && d)
(a && d && b && d) = (a && b && d)
etc. It's so unreadable though.
>>17314 >>17316 >>17317 Yeah, unfortunately hydrus can only handle Conjunctive Normal Form (AND of ORs) atm, no ORs of ANDs, so that parser takes arbitrary logic and outputs CNF for the normal search system to process. I'd like to figure out ORs of ANDs sometime, but the main barrier is just that it'll need some loop-de-doop new edit UI.
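The CNF expansion discussed above can be sanity-checked by brute force. This is just an illustrative script (not hydrus code): it evaluates the parser's four OR clauses against every truth assignment and confirms they are equivalent to (A && B) || (C && D), and does the same for the xor example.

```python
from itertools import product

def cnf(clauses, assignment):
    # CNF: AND over clauses, each clause an OR of literals.
    # A literal is (name, polarity); polarity False means negated.
    return all(
        any(assignment[name] == polarity for name, polarity in clause)
        for clause in clauses
    )

# (A && B) || (C && D) as the parser's CNF output: four OR clauses.
clauses = [
    [('a', True), ('c', True)],
    [('a', True), ('d', True)],
    [('b', True), ('c', True)],
    [('b', True), ('d', True)],
]

for values in product([False, True], repeat=4):
    assignment = dict(zip('abcd', values))
    original = (assignment['a'] and assignment['b']) or (assignment['c'] and assignment['d'])
    assert cnf(clauses, assignment) == original

# A xor B -> (a OR b) AND (-a OR -b)
xor_clauses = [[('a', True), ('b', True)], [('a', False), ('b', False)]]
for va, vb in product([False, True], repeat=2):
    assert cnf(xor_clauses, {'a': va, 'b': vb}) == (va != vb)

print('CNF forms are equivalent on every assignment')
```

So the parser's output is logically correct; the unreadability is inherent to distributing ORs of ANDs into CNF, which can blow up the clause count.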
https://www.youtube.com/watch?v=LgCRbxXY-fs
windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v612/Hydrus.Network.612.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v612/Hydrus.Network.612.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v612/Hydrus.Network.612.-.macOS.-.App.zip
linux
tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v612/Hydrus.Network.612.-.Linux.-.Executable.tar.zst

I had an ok week. I mostly worked on behind the scenes stuff, so I only have some bug fixes to release!

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

highlights

All Jpeg-XL files will be scanned for a bunch of metadata like 'has transparency' and 'has exif'. If you have any weirdly rotated JXLs, they should fix themselves soon after update.

I fixed a couple of bugs with the recent drag-cancel tech, which stops some weird drag states occurring. The test was firing falsely when dragging thumbnails very quickly outside the window, and it was throwing some errors when kicking in on macOS.

Export folders handle sidecars a little better. Missing sidecars are filled in on every run, and if you have a 'synchronise' export folder, it is more careful about not deleting existing sidecars.

next week

I moved the duplicates auto-resolution system forward this week, and it is now pretty much all in place. The database module and daemon are live--but as-yet unpopulated. I now just have to overcome one more difficult hump, a nice preview panel so we can see the impact of a pending auto-resolution rule, and then we'll be ready for final integration and testing and letting advanced users try it out. I'm feeling good about it all and will continue.
(14.11 KB 1150x135 shownamespaces.png)

Does the "Unless" part look to you like it was intended for "Hide namespaces"? The items are uncheckable only if the first checkbox is unchecked.
(49.63 KB 687x519 sixty niggers 2.jpg)

Is there any way to search specifically for a tag, and only that tag without any siblings or parents? I mean like "solo_female" without "solo female". Or vice versa, only searching for "solo female" without any of its siblings. Or do I have to kill the sibling association each time?
>>17299 Great, no worries.

>>17300 >>17301 Yeah, just the db files are changed in an update. If you use FreeFileSync, you can have it compare file-size/file-modified-date to calculate a backup. This runs very fast for an incremental backup job since it recognises all your media hasn't changed and will only overwrite your db files and a few thumbs maybe and add/delete any new/deleted media files. A neat trick, then, is you can simply tell it to reverse the operation and it'll do a complete rollback that is equally as fast and comprehensive. I do a backup before I update every week, so if it goes wrong, I can just rollback and it'll do the db files for me.

>>17302 I'm glad you are enjoying things now. I've been doing a lot of optimisation over the past few years, and a bunch of API stuff is working a bit faster, I think, just because my own UI code isn't so clogged and it has free rein to fetch URL statuses and stuff freely. Let me know if things ever start getting laggy again.

>>17307 Yeah. Please do a search in this thread and in >>>/hydrus/ for 'exe manager' or 'executable manager', and you'll see me talking about my general plans here. I want it, and it is becoming more important with AI autotaggers and upscalers and stuff, but it'll be a big project.

>>17308 Damn, so that side of things seems to be fine, although 20 seconds feels too long. I'll do some work and see what I can do to better plug into the session signals, since it seems like hydrus isn't catching the SIGTERM here, or pre-SIGTERM 'hey session is closing are you good?' stuff, exactly as we want. This stuff isn't always easy to test, but let me know if the overall situation changes in future please.

>>17315 Interesting problem! I don't think you'll be able to make a duplicate search that does 'namespace:n', 'namespace:n+1'. If you are not super strict about the n,n+1 requirement--i.e. 
what you really want to do is examine similar frames--then I'd just let the system work as normal, since it will recognise that frame 5 and 6 are very similar anyway. It might say that 5 and 7 are similar too, or 5 and 57, so if that is a dealbreaker, this approach won't work. If you are very strict about only comparing neighbouring frames, I think, I don't know, you'd probably have to try and figure it out through the client api. Something like:

https://hydrusnetwork.github.io/hydrus/developer_api.html#manage_file_relationships_remove_potentials - do this on all the frames, dissolving what hydrus has already recognised
https://hydrusnetwork.github.io/hydrus/developer_api.html#manage_file_relationships_set_file_relationships - do this programmatically on each n, n+1 pair
https://hydrusnetwork.github.io/hydrus/developer_api.html#get_files_search_files - this to help you find the n, n+1 pairs' file_ids

Then, I think the duplicate filter would present you the pairs as you want. BUT as you said that some were better quality, let's say you said 2 > 3, then 2 would take 3's spot against 4, so you'd be presented with 2,4 anyway.

I think my overall suggestion is to try this on a smaller sample of frames the natural way. Just run them through the filter, let's say without deleting anything, and see if the answer is anything like what you wanted with the n,n+1 idea. Just letting the duplicate filter find and compare pairs is probably going to be KISS here.

>>17320 Thanks, bad grammar, I'll fix it.

>>17324 Not yet, but I'd like to introduce it in the not too distant future. The PTR janitors want the same thing, and a bunch of debug situations want the splitting of fine differences here. It'll come in the form of a system:predicate that searches tags specially according to tag display context (with or without siblings) and tag service (i.e. other than the current tag domain if you want).
Hello. Using Hydrus but new to anything involving programming. I'm downloading high quality tumblr images via the "export your blog" function and at the same time saving the standard tumblr image with the url as the file name, adding file names as tags on import to hydrus, then using the duplicates search to just have the urls linked to the higher res images. It sounds complicated but it is a million times better than having to filter through tumblr exports which are downloaded out of order and manually find the url for each of them while keeping track of what you already have.

The problem is that you can't save an actual url as a file name because of the forbidden characters. However, I have a Danish keyboard with access to the keys æ ø å, which never show up in urls (I browse english language tumblrs). So I just replace forbidden characters with them (I created a word macro for this), and for multi-image posts with the same url I just add å at the end of the file name to get unique filenames. And then just adding "source" as the filename namespace.

For instance, if I wanted to save the images from this post on nasa's tumblr https://nasa.tumblr.com/post/775384136434302976/love-letters-from-space I would end up with my hydrus images being tagged with:
source:øønasa.tumblr.comøpostø775384136434302976ølove-letters-from-space
source:øønasa.tumblr.comøpostø775384136434302976ølove-letters-from-spaceå
source:øønasa.tumblr.comøpostø775384136434302976ølove-letters-from-spaceåå
And so on... (and for tumblr art you can easily add a regex to tag the artist from the url)

But then this still has to be converted. Is there a way to automatically change certain characters in tags to other ones? Like tag siblings or regex but only changing certain characters. Replace ø with /, æ with : and remove å.
>>17327 This was simplified a bit for readability, the actual tag includes https as part of the file name.
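The substitution asked for above is mechanical enough to do in a small standalone script rather than inside hydrus (for example, to turn the filenames into proper source URLs or sidecar files before import). A minimal sketch of exactly the requested mapping, replace ø with /, æ with :, and drop å:

```python
def decode_tag(tag):
    # Replace ø with /, æ with :, and drop the å multi-image markers,
    # exactly the substitution scheme described in the post above.
    return tag.replace('ø', '/').replace('æ', ':').replace('å', '')

# The example from the post, with the https prefix the follow-up mentions:
print(decode_tag('httpsæøønasa.tumblr.comøpostø775384136434302976ølove-letters-from-space'))
# https://nasa.tumblr.com/post/775384136434302976/love-letters-from-space
```

Note that dropping å discards the "which image in the multi-image post" information, which is what the asker requested, but worth knowing before you run it over everything.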
>>17327 I think that instead of trying to solve this weird painful problem, you need a better workflow for getting images into hydrus. Sidecars would be very helpful. Tell me more about the "export your blog" function. I thought you could only use that for your own blog?
>>17327 Never mind, I think I see what you're doing with "export your blog" - you're reblogging a post you want and then exporting your blog, right? Unfortunately, that doesn't come with any metadata files that you could import with the image. Luckily, that's not the only method of getting the highest quality images. I tried making a parser, but tumblr does some annoying things that make it hard. Instead of reinventing the wheel I think the better way to do this would be to just download the images with gallery-dl and import them with sidecars.
1. Download gallery-dl: https://github.com/mikf/gallery-dl
2. Run it with the option --write-metadata to get information about each image in a nice json file:
gallery-dl --directory "C:\gallery-dl\tumblr" --write-metadata https://nasa.tumblr.com/post/775384136434302976/love-letters-from-space
This will put the files in a folder under C:\gallery-dl\tumblr.
3. Create an import folder in hydrus for that folder: https://hydrusnetwork.github.io/hydrus/getting_started_importing.html#automation
4. Create sidecar routers: https://hydrusnetwork.github.io/hydrus/advanced_sidecars.html#json_files
The name of the blog is under "blog_name", the url is under "post_url", and the post timestamp is under "timestamp". That's all you really need.
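For anyone scripting around those gallery-dl sidecars instead of (or before) wiring up hydrus's sidecar routers, here is a small sketch of reading one json file and pulling out the three fields named above. The field names are the ones from that post; the 'creator:' namespace is just an example choice, not anything hydrus mandates.

```python
import json

def tags_from_sidecar(path):
    """Extract hydrus-ready tags from a gallery-dl --write-metadata json.

    Field names ('blog_name', 'post_url', 'timestamp') are per the post
    above -- verify against your own sidecar output. The 'creator:'
    namespace here is an arbitrary example."""
    with open(path, encoding='utf-8') as f:
        meta = json.load(f)
    tags = []
    if meta.get('blog_name'):
        tags.append('creator:' + meta['blog_name'])
    if meta.get('post_url'):
        tags.append('source:' + meta['post_url'])
    return tags, meta.get('timestamp')
```

In practice hydrus's own sidecar routers do this without any scripting; this is only useful if you want to pre-process or merge metadata before import.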
I'd like to ask for a new feature. Could we have some arrows to skip to the next/last entry (thumbnail) in the 'manage tags' dialog? Right now we need to close the dialog, then do the right-click procedure again if we want to compare tags/number of tags (if you are a mouse-only user), or use F3 to open/close the dialog. The idea is, a click on the 'right' arrow would go to the next single entry, even if you had multiple selected before. Or maybe also without arrows, just click on another thumbnail so the dialog gets updated to the one thumbnail selected? That way you could also select several thumbnails again and the dialog would update accordingly. But right now it makes that windows clinking sound if you try to click the thumbnails behind the dialog, so it doesn't work. I needed to check the number of tags in different tag domains for some files, but it's a bit tedious to close and open the dialog again and again, even with F3, so that feature would be great. I don't know if anything would speak against it, from a usability perspective. But I also use two monitors, so idk...
>>17334 It's only one file at a time, but you can do that in the full size file viewer. Just know that tags entered and deleted there are applied immediately, rather than when you close the window via F3 or the apply button. The apply button actually doesn't exist in the tag editing dialogue window for the full size file viewer.
I had a great week. I made much more progress than I expected on the duplicates auto-resolution work, fixed some bugs, and figured out a new system predicate to search tags in a bunch of advanced ways (e.g. on a different tag domain, without siblings, in deleted/petitioned domain). The release should be as normal tomorrow.
>>17335 Oh thank you for the reminder! I actually used that at some point and forgot about it. Will help a lot in many situations! :)
For the others, I hope Hydev can look into my suggestions, especially live updating of the dialog for multiple thumbnails, so you can click/select them and also maybe scroll up/down in the thumbnail viewer with the mouse wheel. The question would be, should the manage tags dialog stay in the foreground all the time, or should there be the option to 'minimize' like other dialogs can, like the 'migrate tags...' dialog just as an example. By default it stays in the foreground but you can minimize it if you want. I think that would be the best.
>>17336 >The release should be as normal tomorrow. Forgot to say, thanks boss!
>>17336 >on a different tag domain, without siblings, in deleted/petitioned domain based I'm still not sure about auto-resolution because it could munch a bunch of my imported AI slop images that have/don't have AI metadata, since there's still no predicate about specifically that (and metadata doesn't always mean that it's not just photoshop exif garbage)
I'd love it if, in the duplicate merging options, there was a "copy from worse if none are on better" option for URLs. So it only copies URLs if there aren't any already; otherwise I get some images that have zero URLs despite having an online source, or many which are just an unholy mess of various URLs of dubious quality, few of which actually serve the same picture.
>>17337 Oh, and to add another idea: I also would like hydrus to be able to sort by 'last tagged time'. I don't think hydrus records times when you tag a file, correct? If you tag a bunch of files after a complicated search, then delete the search and want to check exactly the last tagged files, that would be a good way to find them fast. 'Last viewed time' doesn't work for that if you don't look at them in the media viewer. But I'm not sure if that would work retroactively (I don't think hydrus records tag times behind the scenes?) and also whether there should be a general tag time that considers all tag domains, or one per tag domain. Because when the PTR updates, the time would update also for all the updated files. But since the thumbnail grid is aware of the current tag context now, per tag domain would probably be better, and you could still change to 'all known tags' to make it consider all tag times. Maybe it's too much bloat, but just see it as another suggestion that could come in handy sometimes :p
>>17319 I can't seem to drag to reorder tabs after this update. Is there a setting I'm missing?
https://www.youtube.com/watch?v=iyl3otdvXH4
windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v613/Hydrus.Network.613.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v613/Hydrus.Network.613.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v613/Hydrus.Network.613.-.macOS.-.App.zip
linux
tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v613/Hydrus.Network.613.-.Linux.-.Executable.tar.zst

I had a great week. There's a few fixes and a new clever system predicate.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

system:tag (advanced)

We've wanted a tool to do things like 'search for files with just this tag, no siblings' for a while. After recent conversations with users, I had some good ideas and decided this week to try and just knock it out. It went well, and advanced users are invited to try it. Let me know where it fails!

So, in the normal system predicate list on any search page, you'll now see 'system:tag (advanced)'. You type a tag in manually, no autocomplete, and can set:
- Which tag service to search in, if different to the current search.
- Whether we should include siblings or just search for that tag exactly.
- What 'status' the tag has, be that 'current', 'pending', 'deleted', or 'petitioned'.
- Whether the file should have this tag mapping or not.

This adds several new tools, all pretty clever. Some queries may run very slow, but I think it all works, and I've integrated fast cancel tech. I didn't have time to add it to the system predicate parser, but I planned for it and will try to get it done next week. I may add a proper autocomplete input to the panel, too, for quicker tag entry.

next week

I also had success with the duplicates auto-resolution preview panel I had been dreading. I was expecting to have to do a bunch of rewrites, but I banged my head against it for a bit and a solution popped out. 
The new dialog now asynchronously fetches a rule's search and presents the pass/fail thumbnail pairs in a couple of tables. I'd like to do a bit more--adding a regular media viewer to 'zoom in' on a pair, and only sampling results when the rule has a lot of pairs--but the difficult part is solved. With this, all of the complicated work in duplicates auto-resolution is finally done. I now just have to do final integration and testing. I am going to keep pushing, so with luck, I expect to launch the first version of this system for advanced users to try in v615.

>>17342 Did your options->gui pages->BUGFIX: Disable all page drag and drop get switched on by accident?
Anyone else got their hydrus deleted by microsoft?
>>17345 Happened to me in one specific version 10-20 versions ago too. But only because my regular anti virus did have some problems and turned itself off, so microsoft defender was activated. My regular one doesn't put hydrus into quarantine. But i also didn't test the two latest versions of hydrus yet.
>>17345
>Threat blocked
>Detected: Trojan...
This is a years-old issue. You see, Hydrus has an embedded server in its code and Windows confuses it with a trojan. Of course it is a false positive. To my knowledge, the procedure to avoid triggering the anti-virus is that devanon would have to pay Microsoft, out of his own pocket, for a code audit to get it onto a white-list. Naturally that's outrageous and will never happen. So, two choices for ya: either you white-list Hydrus in the anti-virus settings, or you switch to a different operating system.
>microsoft
Ahem. Respectfully... you've got what you deserve.
(915.72 KB 3074x1659 Gelbooru Shutdown.png)

Gelbooru admin is vagueposting about shutting down.
>>17343 Ah, it turned out to be some strange windows issue. For some reason I could reorder chrome tabs but not firefox tabs.
>>17343 >system:tag (advanced) I'd like most of all to do something like system:has tag in "all known tags", ignoring siblings, with status pending: "*:*_*" So I can more easily filter out all the trash tags before uploading to the PTR. Even better would be an "apply all siblings relations to pending tags" that would convert everything within the search to its sibling equivalent.
Jesus fucking christ, why is sankaku such a fucking piece of shit? hydownloader is failing to download from it again, giving "os error". What does that even mean?
Has anyone been able to solve tbib's mass-downloading of dummy files? I can clear the file log from my subscription panel but it would keep downloading (=ignoring) the same amount of files.
The age of hoarding is about to be over. Cloudflare stops all my hydrus attempts now.
>sankaku now have daily login rewards
Can someone explain to me why that fucking shit site is still being used
>>17334 >>17337 Thanks. This is actually tricky to navigate, which is why it is so awkward now. The main thing is with a 'modal' dialog (one which locks out the rest of the program), we have the idea of apply and cancel. If a user makes changes in the dialog but then backs out, we roll them back to what we had when we started.

If, broadly speaking, I allow the user to continue using the rest of the program while the dialog is open, then I'd have to deal with hypotheticals like them opening up two different dialogs on the same file, or using a shortcut to apply some tags to a file after the initial dialog was open--in that case, my dialog is no longer neatly 'atomic', and we are dealing with a more complicated situation to rollback to or commit from. We already live in that world a little bit, since downloaders can apply tags to files in the background while a dialog is open, but it generally hasn't proven a big deal so far.

Anyway, that's why the manage tags panel is an immediately-committing non-modal 'frame' when launched from the media viewer, but a modal apply/cancel thing when launched from (potentially a whole selection of) thumbnails. It saves me a ton of time and potential headaches, particularly with manage tags, which is pretty complicated and CPU heavy behind the scenes.

There isn't a superb solution other than me knuckling down and writing better and cleverer code. I can't promise this will get better too quickly, but I do first want to make the manage notes, urls, ratings, and anything similar work like manage tags in the media viewer. I want you to be able to have smooth editing+navigation with one or more windows open off the media viewer, and as I develop that unified solution for multiple dialogs, I will keep all this in mind. 
The manage tags dialog pulls off the dual modal and non-modal capability with a bunch of hardcoded hacks, so first I will pull all that tech out, smooth it out, and generalise it for any dialog so I can apply it to the notes and urls and stuff. Once the code is cleaner, I can think about making it complicated in a smarter way.

>>17341
>I don't think hydrus records times when you tag a file, correct?
Yeah. I think this is a good idea. We have pretty decent timestamp management these days, and while I don't want to track every possible change, I'm all for a 'last time the human edited tags on this file on any service' timestamp and just hooking it into the same search and sort routines as something like 'last viewed time'. Yeah, we can't fill this data in retroactively. I don't know when things were tagged in the client atm.

>>17339 I will write up some proper help as this system launches, but I'll say here: this system will be highly user-configurable and always off by default. I aim to have a bunch of decent templated rules that users can try for common situations, but you'll always be the one pulling the trigger, and you will see the rules and see a preview of what is about to happen. I absolutely do not want users to half-accidentally fire something off and discover a bunch of decisions were made that they didn't want.

>>17345 Sorry for the trouble. It felt like this was happening less these days, so it is a shame to see it back. It is a false positive. All the source is on the repo, and the build is created on the github cloud itself using this script: https://github.com/hydrusnetwork/hydrus/blob/master/.github/workflows/release_win.yml
If this keeps hitting you, I recommend running from source: https://hydrusnetwork.github.io/hydrus/running_from_source.html
>>17349 Shame, if so. Feels like we are in an inflection point in a bunch of ways right now. If more and more of the imageboard community is squeezed out by the big guys and/or pressure of AI rippers, I don't really know what the future will be, whether that is a new community in a new place or a radically different set of protocols, or what, but I'm happy that hydrus, with its 'local only, no cloudshit' autism, provides something of a backstop.

>>17353 No worries. Windows has some funny rules about drag and drops between processes, usually in regards to programs that are not in admin mode DnDing to programs that are. Same deal with the clipboard and global shortcut hooks. If you are still trying to figure this out, on the Firefox side at least, if the main FF process is running as admin but the individual page processes or UI process(?) aren't, I wonder if that could be blocking the tabs through some convoluted rule. Or some Window Manager/UI-styler with the same thing.

>>17354
>wildcards
aiiiiiiieeeeeeeee
I may be able to do this in future, but I can't promise anything now. I was thinking of it as I was developing the new search routine behind this predicate, and it'll be doable, but a ton of code cleaning may have to happen first. I was lucky in that the efficient tag mappings search for all sorts of domains was already done, so I just had to adapt it for the new predicate and add a bit of slow stuff for deleted and petitioned, but the wildcard tag definition search is not so pretty, and so implementing this in a way that doesn't take forty-eight minutes to search the PTR may need some prep work. I'll note it down, please remind me if three months pass by and nothing happens.

Hard sibling replace is something I am thinking about seriously. It will start as a janitor-only ability on the PTR, but I'll extend it to normal users for local tag domains and we'll see how it goes, and maybe extend it to normal users on repositories too.
(660.90 KB 780x585 f7tgb_wxuaaagr_.png)

>>17362
>Hard sibling replace is something I am thinking about seriously.
Considering the PTR is currently north of 87GB – closer to 100GB+ – the PTR really could use that ability, if not the ability to replace the mappings of one hash with another via multiple reference. Consensus of duplicate replacements could be used for that, uploaded (probably opt-in) by users. It shouldn't need any revealing information, just "I'm replacing the mappings of hash #1 with hash #2" if both are in the same public database's domain, no worse than uploading tags. Enough of those, and you can delete hash #1's tags and have it reference hash #2's tags. Given how many duplicates there are, it could prune a lot of the database.
Think I found a bug when exporting files and using json sidecars with the export folder in v613. I've set up an export process that uses routers for tags and urls and adds them to a json sidecar, but it seems that only one of the metadata routers is added to my json sidecar. The regular "export files" dialog seems to work just fine though; all the routers are added to the sidecar as expected.
>>17347 >>17348 >>17361 I force-reinstalled it with winget and it's working fine for now.
>sankaku I'm getting error 403 on search pages when using the one from the repo and pasting daily auth tokens like I always did.
>>17358 Because it allows big filesizes for long videos while allowing stuff that's outright banned or too low quality for most video-specific hostings, and it has user-defined tags for easy navigation.
>>17368 What a pain Hydrus stopped working for it like a year ago and now even that shitty gallerydl wrap isn't working
>>17369 the v2 api downloader worked fine until this week, it just required lots of manual babysitting because auth keys and ddl urls expire all the time.
>>17370
>the v2 api downloader worked fine until this week, just required lots of manual babysitting because auth keys and ddl urls expires all the time.
Well yeah, might as well not work, pointless to automate hundreds of artists if you need to babysit the downloader.
>>17349 Gelbooru has been down for 7+ hours.
>>17372 they said it's a routing issue
Hey hydev, just got a sudden Hydrus while using sway (Wayland), looks like it's due to an MPV error expecting X11. v612, 2025-03-16 21:34:53: A media loaded in MPV appears to have had an error. This may be not a big deal, or it may be a crash. The specific errors should be written after this message. They are not positively known as crashy, but if you are getting crashes, please send the file and these errors to hydev so he can test his end. Its hash is: 530d68237026581637b578bab5f0f1b851c2033548ebc107dd60537fda48f43c v612, 2025-03-16 21:34:53: [MPV error] vo/gpu/x11: X11 error: BadDrawable (invalid Pixmap or Window parameter) I will be able to test more about it later, no time right now.
Thanks for recently adding the "is not filetype" function. Helps immensely since some filetypes are disabled right now due to an attack on the site.
>>17374 >sudden hydrus Uhh, sudden hydrus *crash*, I think it happened as soon as I clicked the file, I tabbed away for a second.
>>17376 again Just clicked on an animated gif while looking for a meme and hydrus froze. It's almost certainly a sway/wayland issue, I've never had hydrus dying on every video before. Its hash is: 3959d075181c1b1963127191349d2eb86756d012c96e1d93543d6608d1272dc3 v612, 2025-03-18 12:54:08: [MPV error] vo/gpu/x11: X11 error: BadDrawable (invalid Pixmap or Window parameter) v612, 2025-03-18 12:54:08: [MPV error] vo/gpu/x11: Type: 0, display: 0x7f344c0040d0, resourceid: 69, serial: 16 v612, 2025-03-18 12:54:08: [MPV error] vo/gpu/x11: Error code: 9, request code: e, minor code: 0
>>17377 More tests: Trying with dbus-run-session: X Error of failed request: BadWindow (invalid Window parameter) Major opcode of failed request: 4 (X_DestroyWindow) Resource id in failed request: 0x6200002 Serial number of failed request: 52 Current serial number in output stream: 55 v612, 2025-03-18 13:02:09: QObject::killTimer: Timers cannot be stopped from another thread v612, 2025-03-18 13:02:09: QObject::~QObject: Timers cannot be stopped from another thread I also tried Xwayland but it didn't start at all.
>>17356 >tbib The problem is with TBIB's search. Their search has been borked for ages, and they apparently don't care about fixing it.
>>17374 >>17376 >>17377 >>17378 Are you running from source?
>>17293 Do you use downloader siblings to "filter" known downloader tags? For example: "series:genshin impact" -> "game:genshin impact" I don't use the PTR either, but personally I get a lot of mileage out of downloader-only siblings
>>17382
at the bottom of the hydrus_client.sh file, try replacing

python hydrus_client.py "$@"

with:

env --unset=WAYLAND_DISPLAY --unset=QT_QPA_PLATFORM QT_QPA_PLATFORM=xcb python hydrus_client.py "$@"

that might help if the issue is related to interactions with Qt. just a shot in the dark though
>>17381 >Do you use downloader siblings to "filter" known downloader tags? For example: >"series:genshin impact" -> "game:genshin impact" Nope, with one exception for yokai watch/yo-kai watch or whatever because I keep misspelling it myself.
>>17383
No dice but it at least segfaults instead of freezing. The coredump says the error comes from PySide6...libQt6Gui.so.6
Here's the stack trace, I don't know shit about debugging but the ??? entries are a little odd, I was suspicious of them at first but I see that libc has it too so I guess it's fine. Is that what you see if you don't have debugging symbols?

#0 0x00007f59f87d1110 in ??? () at hydrus/venv/lib/python3.12/site-packages/PySide6/Qt/lib/libQt6Gui.so.6
#1 0x00007f59f89ba6ad in QPainterState::QPainterState() () at hydrus/venv/lib/python3.12/site-packages/PySide6/Qt/lib/libQt6Gui.so.6
#2 0x00007f59f8998a2d in QRasterPaintEngine::createState(QPainterState*) const () at hydrus/venv/lib/python3.12/site-packages/PySide6/Qt/lib/libQt6Gui.so.6
#3 0x00007f59f89bfb52 in QPainter::begin(QPaintDevice*) () at hydrus/venv/lib/python3.12/site-packages/PySide6/Qt/lib/libQt6Gui.so.6
#4 0x00007f59f6b9ee62 in ??? () at hydrus/venv/lib/python3.12/site-packages/PySide6/Qt/lib/libQt6Widgets.so.6
#5 0x00007f59f6b9d3b1 in QMainWindow::event(QEvent*) () at /venv/lib/python3.12/site-packages/PySide6/Qt/lib/libQt6Widgets.so.6
#6 0x00007f59f6a0f632 in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at hydrus/venv/lib/python3.12/site-packages/PySide6/Qt/lib/libQt6Widgets.so.6
#7 0x00007f59fbc6ad3a in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at hydrus/venv/lib/python3.12/site-packages/PySide6/Qt/lib/libQt6Core.so.6
#8 0x00007f59f6a5ee26 in QWidgetPrivate::sendPaintEvent(QRegion const&) () at hydrus/venv/lib/python3.12/site-packages/PySide6/Qt/lib/libQt6Widgets.so.6
#9 0x00007f59f6a5f62e in QWidgetPrivate::drawWidget(QPaintDevice*, QRegion const&, QPoint const&, QFlags<QWidgetPrivate::DrawWidgetFlag>, QPainter*, QWidgetRepaintManager*) () at hydrus/venv/lib/python3.12/site-packages/PySide6/Qt/lib/libQt6Widgets.so.6
#10 0x00007f59f6a72245 in QWidgetRepaintManager::paintAndFlush() () at hydrus/venv/lib/python3.12/site-packages/PySide6/Qt/lib/libQt6Widgets.so.6
#11 0x00007f59f6a6761c in QWidget::event(QEvent*) () at hydrus/venv/lib/python3.12/site-packages/PySide6/Qt/lib/libQt6Widgets.so.6
#12 0x00007f59f6a0f632 in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at hydrus/venv/lib/python3.12/site-packages/PySide6/Qt/lib/libQt6Widgets.so.6
#13 0x00007f59fbc6ad3a in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at hydrus/venv/lib/python3.12/site-packages/PySide6/Qt/lib/libQt6Core.so.6
#14 0x00007f59fbc6e24d in QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) () at hydrus/venv/lib/python3.12/site-packages/PySide6/Qt/lib/libQt6Core.so.6
#15 0x00007f59fbf24223 in ??? () at hydrus/venv/lib/python3.12/site-packages/PySide6/Qt/lib/libQt6Core.so.6
#16 0x00007f59f9521992 in ??? () at /usr/lib64/libglib-2.0.so.0
#17 0x00007f59f9524dc7 in ??? () at /usr/lib64/libglib-2.0.so.0
#18 0x00007f59f9525570 in g_main_context_iteration () at /usr/lib64/libglib-2.0.so.0
#19 0x00007f59fbf23afa in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at hydrus/venv/lib/python3.12/site-packages/PySide6/Qt/lib/libQt6Core.so.6
#20 0x00007f59fbc76573 in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at hydrus/venv/lib/python3.12/site-packages/PySide6/Qt/lib/libQt6Core.so.6
#21 0x00007f59fbc7300e in QCoreApplication::exec() () at hydrus/venv/lib/python3.12/site-packages/PySide6/Qt/lib/libQt6Core.so.6
#22 0x00007f59f74ebcb5 in ??? () at hydrus/venv/lib/python3.12/site-packages/PySide6/QtWidgets.abi3.so
#23 0x00007f59fffcf284 in ??? () at /usr/lib64/libpython3.12.so.1.0
#24 0x00007f59fff81d13 in PyObject_Vectorcall () at /usr/lib64/libpython3.12.so.1.0
#25 0x00007f5a0007aaa0 in _PyEval_EvalFrameDefault () at /usr/lib64/libpython3.12.so.1.0
#26 0x00007f5a0008287c in PyEval_EvalCode () at /usr/lib64/libpython3.12.so.1.0
#27 0x00007f5a000d7576 in ??? () at /usr/lib64/libpython3.12.so.1.0
#28 0x00007f5a000d7675 in ??? () at /usr/lib64/libpython3.12.so.1.0
#29 0x00007f5a000d7780 in ??? () at /usr/lib64/libpython3.12.so.1.0
#30 0x00007f5a000da6cb in _PyRun_SimpleFileObject () at /usr/lib64/libpython3.12.so.1.0
#31 0x00007f5a000dadc0 in _PyRun_AnyFileObject () at /usr/lib64/libpython3.12.so.1.0
#32 0x00007f5a00100ecc in Py_RunMain () at /usr/lib64/libpython3.12.so.1.0
#33 0x00007f5a00101418 in Py_BytesMain () at /usr/lib64/libpython3.12.so.1.0
#34 0x00007f59ffcdb22e in ??? () at /lib64/libc.so.6
#35 0x00007f59ffcdb2e9 in __libc_start_main () at /lib64/libc.so.6
#36 0x000055f117e62095 in _start ()
>>17385
>Is that what you see if you don't have debugging symbols?
I actually don't think I've seen this error before myself, so I have no idea here. I have one more idea though that might help. Keep that "env" part that you added in the startup script, and in your mpv.conf (the one used by Hydrus) add this config option:

gpu-context=x11egl

Remember to actually go into Hydrus to update the config under the "media playback" settings, since I don't think Hydrus updates it automatically. (could be wrong though)
>>17386
Still nothing, don't feel too bad about not solving my issue anon, like 50% of the programs I use are having issues in wayland. I'm only using it because Xorg is having major problems, and I now know that it is an X issue (great).

X Error of failed request: BadWindow (invalid Window parameter)
Major opcode of failed request: 4 (X_DestroyWindow)
Resource id in failed request: 0x2400002
Serial number of failed request: 52
Current serial number in output stream: 55
v612, 2025-03-18 19:01:52: QObject::killTimer: Timers cannot be stopped from another thread
v612, 2025-03-18 19:01:52: QObject::~QObject: Timers cannot be stopped from another thread
>>17388 wow that seems serious. hopefully Hydev will know more about what those errors might mean then, at least on the Qt side
>>17374 >>17377 >>17386
I am no Linux or mpv expert, but the thing that stands out to me here is the:

>v612, 2025-03-16 21:34:53: [MPV error] vo/gpu/x11: X11 error: BadDrawable (invalid Pixmap or Window parameter)

line. I've seen trouble before with 'vo', which is documented here: https://mpv.io/manual/master/#options-vo so I think something like >>17386 's 'gpu-context' or 'vo' might be a good tack here. Maybe this?

https://mpv.io/manual/master/#video-output-drivers-dmabuf-wayland
https://mpv.io/manual/master/#video-output-drivers-wlshm

Or take the other end and try 'sdl', which I think is the software-only rendering--if that doesn't crash, then you know 'vo' is the guy to hit. You can edit the mpv.conf in your db dir directly. It reloads in all mpv windows every time you ok file->options.

>>17388 >>17390
This specifically is a bit odd and I don't totally understand it. In general, the easiest way to get hydrus (or most UI programs) to crash is to touch UI stuff from the wrong thread. Since UI things are often working with synchronisation-delicate stuff like the GPU, you generally have one ecosystem of threads that are set up to do UI stuff at the correct time and correct place, and non-UI threads have to post events that are then processed in the main UI event loop where it is safe. This error looks like a classic crash in that some non-Qt thread is telling a QTimer to do something. Probably, although I can't be certain, it is calling the C++ level destroy function on the QTimer, so my best guess here is that some external thing is saying something like 'hey, this object/thread/process/something is invalid, kill it now', and that's grabbing a QTimer in there and nuking it, and then it is sperging out because we aren't in the Qt thread, and that seems to be caught by the X Error catcher. It saying X_DestroyWindow is probably part of the whole thing.

So, where's the QTimer here? I only use one, really, in hydrus, to do some animation-level timing stuff like thumbnail fades. That's owned by the main gui window and lingers until program exit, so this error could be a complaint on that timer during a forced whole-program shutdown somehow from an external thread, but I suspect instead this is a QTimer that's operating inside Qt itself (down again at C++, rather than my python code) to do one of a million tasks like animating a checkbox click transition or whatever. It may be that some of the mpv stuff works via this, and that's likely since that's what's causing the problem, but I would guess, right now, that this error is more of a symptom of error-reporting that is failing to handle and clean up a problem rather than the cause of the problem per se.

The trace in >>17385 , which is all C++ level, seeeems to back this up. It looks like a PaintEvent, which is the most dangerous and crashy time. Probably the main mpv widget (on the Qt side) trying to draw itself to screen, and then it, idk, let's say it is asking mpv about something as it tries to draw stuff in the media viewer (not the hardware accelerated mpv window itself though--I'm pretty confident mpv draws itself to its own window id on its own thread using a different mainloop, which is another tangled nest that can cause crashes I'm sure), and because this version of mpv isn't setting the hardware drawing flag in the way wayland/X/whatever wants, it is raising an exception. Any unhandled exception inside a paint event is basically crash time.

I still think the right tack for now is try stuff at the mpv.conf level. That way you are telling C++ stuff how to initialise and talk to other C++ stuff. Get mpv calling different OS-level graphics APIs and we'll hopefully find one that won't go bananas with this specific setup. Let me know what you figure out!
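Pulling the suggestions from this thread together, here is a hypothetical mpv.conf to experiment with. The option names are real and documented in the mpv manual, but which combination (if any) fixes this particular sway setup is exactly the open question, so treat it as a checklist, not a fix:

```ini
# mpv.conf in your hydrus db dir. After each change, go ok file->options
# in hydrus so the media viewer reloads it, then try a video.
# Uncomment ONE line at a time.

# 1) force an EGL-on-X11 context (the earlier suggestion in this thread)
#gpu-context=x11egl

# 2) wayland-native video outputs, per the mpv manual
#vo=dmabuf-wayland
#vo=wlshm

# 3) fallback renderer -- if this one doesn't crash, 'vo' is the guy to hit
#vo=sdl
```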
Edited last time by hydrus_dev on 03/19/2025 (Wed) 02:21:14.
I had a great week. Duplicates auto-resolution is very almost done, and for the release there's nice new zoom-locking tools, some quality of life, and the new 'system:tag (advanced)' gets a bit more polish. The release should be as normal tomorrow.
I'm revisiting the hydrus install I used to scrape a bunch of cringe-inducing shit and exploded artist blogs from 5+ years ago (version 417) and I'm noticing something very disturbing: hundreds of files in client_files/f?? with no extension and no apparent file format other than some repetitive byte strings. It's not JPEG or PNG, 'file' says they're all "zlib compressed data" but that could mean fuck all. My mappings.db is also a hot mess but that's not a big surprise considering how much I was fucking around with the PTR at the time and I'm pretty sure I force closed it in the middle of a PTR update a couple times. I just want to find out if this is expected and if not, how fucked I am. I'd upload a screenshot showing what I'm talking about but the acid fatty won't let me.
>>17393 I'm pretty sure you are looking at PTR repository update files, meaning the files that get downloaded before being applied to the PTR. If you change your file domain to 'repository updates' and search for 'system:everything', you will see them (20,401 files, 14.4GB for me). If you can't see the 'repository updates' domain, you need to activate advanced mode first under 'help' at the top. Then you right click one of the files -> open -> in file browser, and it will take you to one of exactly those files. Deleting them after your PTR updates are done should do no harm. They don't delete themselves, I guess they act like a backup.
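If you want to peek inside one of those update files yourself, a minimal sketch (the path is hypothetical; the payload should just be zlib-compressed JSON, matching the "zlib compressed data" that 'file' reported):

```python
import json
import zlib

def peek_update_file(path):
    """Decompress a hydrus repository update file and parse it as JSON."""
    with open(path, 'rb') as f:
        blob = f.read()
    return json.loads(zlib.decompress(blob))

# e.g. peek_update_file('client_files/f00/<some hash>') and poke around
# the lists of tag/file definitions inside
```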
>>17343 >- Whether we should include siblings or just search for that tag exactly. How about finding tags that are really there and not added as parents?
>>17367 >>17355 api url changed, parsers, downloaders etc need to be updated.
>>17394 Neato. One more question: did the tumblr NSFW purge a few years ago totally fuck the affected blogs or is there still a secret handshake to get at original images? Right now when I look at a particular NSFW blog on tumblex random images are just 400px thumbnails.
https://www.youtube.com/watch?v=a_2VS4NFAXU
windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v614/Hydrus.Network.614.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v614/Hydrus.Network.614.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v614/Hydrus.Network.614.-.macOS.-.App.zip
linux
tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v614/Hydrus.Network.614.-.Linux.-.Executable.tar.zst

I had a great week. There's some new zoom-locking tools and quality of life.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

highlights

Thanks to a user, we have some neat new zoom tech for the media viewer. There's a new zoom button in the top hover window that lets you set different zoom types like 'fit horizontally' and 'canvas fill', and we also finally have zoom locking, so if you want to browse some media all at 100% zoom or whatever, you can just set that and set it to lock and it'll stay that way as you move back and forth. There's also 'pan locking', to help comparison between similar files, and I added the 'size locking' zoom tech from the duplicate filter, too. As a side thing, the 'eye' icon on the top hover also lets you disable the hover-window pop-in for the tags and top-right hover windows.

The new 'system:tag (advanced)' predicate now has a tag autocomplete to make it easier to enter the tag you want, and the predicate now works in the system predicate parser.

There's two new 'catmocchin' QSS styles submitted by a user. Check the contents of the files themselves for more ideas on style and other colours.

I boosted the default 'memory reserved for image cache' from 384MB to 1GB for new users. If you have never touched the stuff under options->speed and memory, you might like to go in there and make sure your basic image cache amount is decent, especially if you get laggy image load on >4k images. 1GB is very good for most situations.
There's an annoying bug in the media viewer where images will sometimes start to take two frames to zoom--one where they move to the new position, then the second, flickery frame where they resize. I discovered how to reproduce it and think I have it fixed!

next week

I did final integration on the duplicates auto-resolution system this week, and everything worked! I pretty much just have to write help documentation and unit tests for it, so I will focus exclusively on this and aim to launch the simple initial test for advanced users next week. If you are interested in this system, I enabled all the edit UI this week so feel free to go bananas and see what happens. All users see it now, too--it is a third tab on any 'duplicates' page. The dialog will not save any rules back, so it won't do anything, but you can see what I'm thinking of. Let me know if anything breaks!
(50.11 KB 499x376 woah crash.jpg)

>>17401 You're a fucking wizard, man.
Suggestion for duplicate processing. When I have large files (10+MB and 2000+ width or height) and I need to compare them in the duplicate processor, or just when moving forward or back in the media viewer, swapping between the files will have a delay, and during this delay the display is blank. This can make it difficult to spot minor differences. Can the first file linger until the last possible moment as the new one is loading?
Not really a hydrus question, but does anyone know if there's a fine arts booru? Like a booru for classical paintings and whatnot?
I'm trying to set up hydownloader but can't get it to work over vpn or proxy. If I enable proxy in settings, in test run logs I get
>Missing dependencies for SOCKS support.
I tried installing requests[socks] as mentioned in the gallery-dl docs but it did not help. As for VPN - I'm using a client with split tunnel support, do I have to point it at the entire python.exe? Also, importing known IDs from Hydrus did not work, but that may be because of the sankaku url change. Or am I doing it wrong? The importer ran successfully and displayed a matching number of entries.
>>17401 I started looking at the UI for the dupe auto-resolution and I have a suggestion. I think you should change the wording for the comparison rules so that, instead of "A will pass:" or "B will pass" you should say "A will match:" or "B will match:". saying "pass" confused me and made me think that the one that gets that rule will be the one that's counted as the better dupe. "match" makes it more clear that it's a check that the file matches the pattern.
Just updated from v611 to v614. I accidentally zoomed in a video 2 times quickly, and it instantly froze Hydrus for about a minute with my cpu running a bunch, then Hydrus crashed. It only gave this one error in the log:

python3.12: ../src/x11/x11-window.c:2344: eplX11SwapBuffers: Assertion `pwin->current_back->status == BUFFER_STATUS_IDLE' failed.

I tried it again, but this time I zoomed in slower and 3 times, and it crashed again. It didn't do this before.
>>17427 I can't reproduce it with images, so I'm guessing MPV is involved with the error
>>17401
>Thanks to a user, we have some neat new zoom tech for the media viewer.
I've been missing that! One layout niggle: I have to resize the "Let's review our rule to make sure it makes sense" window every time for the third column's header to fit, and the "edit rules" window once.
>>17289 (sorry I've been busy)
>But wait, now I see a real URL here, I see the 'f' you mentioned, and that it is already being stripped
unfortunately, I looked into it, and it's not being stripped. it only shows up when you view the file directly from kemono.su normally, but the downloader uses the api, which doesn't have the f parameter as part of the file url. that being said, I just used a regex substitution to recreate the parameter and attach it as the url to download. I hope you're right and that it works, but the downside here is that there's gonna be a whole bunch of duplicate known urls now, that only have an unnecessary parameter different between them. it's gonna make kemono.su urls messy unfortunately. but if it works, it works. I do hope that Hydrus eventually gets some sort of "metadata merge" for the downloader's file urls so that this hack isn't necessary anymore, but as long as this allows the tags to properly be added to all files (I'll hopefully see if it does when my subscription has updates) then it's good enough for now. Thanks for the help!
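For anyone curious, the substitution amounts to something like this plain-Python sketch. The URL shape and function name here are hypothetical illustrations -- the real version lives as a regex step inside the hydrus parser, and the actual kemono file-url layout may differ:

```python
import re

def add_f_param(file_url, filename):
    """Re-attach the 'f' filename parameter to a bare api file url.

    Hypothetical sketch: in hydrus this would be a string-conversion
    regex step in the parser, not a python function.
    """
    # don't double up if the parameter is somehow already present
    if re.search(r'[?&]f=', file_url):
        return file_url
    sep = '&' if '?' in file_url else '?'
    return f'{file_url}{sep}f={filename}'

# add_f_param('https://kemono.su/data/ab/cd/abcd1234.png', 'page01.png')
# -> 'https://kemono.su/data/ab/cd/abcd1234.png?f=page01.png'
```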
>>17401 It looks like the settings for making rules are just ordinary searches. is it still planned to allow you to compare the 2 files directly in rules, like "A has >=10% larger filesize than B" or something like that, or was that postponed? if you already mentioned it then I must've missed it. I hope it comes eventually because a lot of the rules I wanted to make would involve comparisons, but this still looks like it'll be pretty useful as is. pretty cool!
Super anecdotal, but just switched to Wayland (niri); I'm starting Hydrus with Xwayland plus the suggested `QT_QPA_PLATFORM=xcb`. I've switched to the native hydrus viewer for videos as mpv really sucks in my case (it creates two separate windows that don't go away and stay black after playing, not sticking to the main client or the media viewer). I'm happy with the native hydrus viewer and that setup seems to work pretty well for me, so thanks for providing that alternative hydev!
>>17363 We'll see how it all goes as these tools slowly roll out, and faster duplicate resolution. I added an advanced 'purge tags' tool for janitors last year, and that went well, although I'm sorry to say the absolute savings of these things feels like it is in the 1-5% range at best, so while it is nice for the human experience to clean up garbage, it isn't earth-shattering on a storage level. Unfortunately, also, for space-saving purposes, the PTR system doesn't yet have a way to completely divest itself of a tag mapping--when we 'delete' something, I just move its record to the master 'deleted' table. There are good reasons for this--to make sure that the bad tag is not re-added by the same misfiring parser or whatever later--but we'd like an option for a complete space-saving purge and/or orphaned-id cleanup. I'm thinking of these systems, and I expect in the coming years the PTR will get massive overhauls. The growth is delightful but unsustainable.

>>17365 Thank you, sorry for the trouble. I will check this out this week or next.

>>17366 Great. I am not an expert, but I believe most of the false positives here come from 'Smart Screen' or 'Automatic sample submission' or whatever MS call it, which is their cloud submission system that delivers real-time threat definitions. Afaik, when you run an (unsigned?) exe, MS sends a hash or profile or something of the exe to their cloud, and that process can result in the confident false positive report. When users (and maybe user behaviour?) report that it is a false positive, the cloud learns that that hash report was a false positive, so like ten days after release, Windows Defender no longer gives the report. The data they learn through all that, I assume, becomes the regular definition updates they actually push to Windows Update. Thus, although I don't think it is fantastic to recommend, turning off Smart Screen cloudshit under the Defender options--I just looked and it looks like they call it 'cloud-delivered protection' and 'automatic sample submission' now(?)--tends to reduce these false positives since your Windows isn't asking the realtime-yet-error-prone cloud system about the new hydrus exe. But I guess if you turn that stuff off, you aren't getting it when it does fire off correctly.

>>17393 Sorry for the confusion. I'll write an FAQ for this--it has come up before. If you use zlib to decompress them, you'll see it is just giant JSON lists of numbers and tag/file definitions.

>>17398 Pretty sure it isn't widely available in any way. I think for a while they let accounts see their own nsfw posts, so people could backup and stuff, but I don't know if that is still true. If you see some cached thumbs, maybe it is, or that could just be some external image host remembering them. They changed their policy again after another more recent change in ownership, I think, as well. I think you can have artistic nudity, gravure stuff, again, but nothing hardcore. I think I got my hydrus tumblr in ~2006/7 (EDIT: looks like June 2008), because they had the hottest post editor at the time. Would have never guessed all the changes since. You may well know, but it used to be where all the early sfm creators posted crazy stuff; you'd never think it today. I'm amazed it is still running, as I'm not sure they've ever had a profitable quarter. I'm expecting it to just shut down one day.

>>17395 Thank you for mentioning this. I changed this for v614. I forgot that the storage/display tag domain dichotomy has the parent-gap also.
Edited last time by hydrus_dev on 03/22/2025 (Sat) 19:16:28.
>>17403 Thanks; had a good few weeks recently. We can die any day, so keep on pushing.

>>17423 Yeah I'd like some better render tech here. I don't like how ugly the media viewer is in the intermittent moments, and it'd be nice if it could wait on a media transition until the next one was ready. Perhaps I'll be able to do that when I finally get around to overhauling the awful layout system that backs it atm. For now, your best bet is to boost your 'image cache' sizes under options->speed and memory. I increased the default values here for new users in v614, so you might like to go in there and double-check you have a decent amount yourself. 1GB cache size and letting a single image eat 25% of that allows for a ~16,000x9,000px image to stay in the cache. If a file is in the cache, it should load instantly when you flick between two images. Make sure you don't have an accidentally anemic 'image tile cache' either. 256MB tile cache is great.

>>17425 I know very little about hydownloader, but I have to add 'pysocks' to the hydrus requirements.txts to (iirc) get requests SOCKS stuff working. I dunno if 'requests[socks]' does that too, but you might like to try it.

>>17426 Thanks, great idea.

>>17427 >>17428 Thank you. I changed the media viewer to do 'setposition and setsize' on the actual media window in one call this week, to stop a flicker issue. Sounds like X11 doesn't like that for mpv. Sorry for the trouble--I will make a DEBUG option to go back to the old behaviour for mpv for next week. Let me know if that fixes it, and if it does I'll make it default for Linux or something.

>>17430
>Have to resize the window with "Let's review our rule to make sure it makes sense" every time for the third column's head to fit, and the "edit rules" window once.
Thank you, will fix!
>>17431
>I do hope that Hydrus eventually gets some sort of "metadata merge" for the downloader's file urls...
Yeah absolutely. We'll need a lot more tools around this as duplicates auto-resolution takes off, too. I want a bunch of en masse URL management tools too to do 'hey convert/merge all these legacy http urls with their modern https equivalents' and so on. It'll just need an overhaul of URL storage&search, and some UI.

>>17435 Yep, exactly. The first test for v615 is just going to be pixel-perfect jpegs and pngs, and I focused on getting the platform and workflow working solid. We'll make sure the system works at scale on that trivial and unobjectionable rule, and once happy I can then add more levers and handles to the system. The 'A has x property' test is simpler than an AB, so I did it first, but I absolutely will figure out an object to handle 'A is at least five times larger than B' and so on, and I'd really like to do it like the 'A has x property' system where it works with existing system predicate tech in some way, so I'm not writing all new UI and hardcoded comparison blah for every new testable variable. This is inside baseball talk, but I'm thinking of adding a call to system preds where they eat a media and give up a number related to their type. So for filesize, they'd eat up a file and produce the num_bytes. Then, with minimum effort, I can have some generalised comparison UI and backing object like:

A SYSTEM PRED | OPERATOR | MULTIPLE | B SYSTEM PRED

Or maybe just the one system pred, I dunno if you'd ever want to compare A's height with B's width or something, but maybe that's something to think about later. Anyway, you'd say "A system:height | >= | 5x | B system:height", and then all I have to do in future weeks is add the simple 'hey yeah I can produce a number' method to other system pred types, and we magically get new tools with the same UI. Simple start, and iterable growth of new tools thereafter. We might figure out some more tightly coupled file-to-file comparison tech like 'A is 99.7% pixel-similar to B', but I'd like to see how it works IRL and we can get a real feel for what we are actually missing and what is worth dumping time into. If we can eliminate 80%+ of all the relatively easy same-file duplicates automatically within twelve months, I'll be delighted.

>>17436 Sorry for the trouble here. I have heard this new wayland stuff is causing headaches all over the place, but I've also heard that it is shaking the rust off of bad old legacy systems, and I'm pretty ignorant about all Linux politics, so I hope it all just sorts itself out in a year or two and Qt and other players can adjust to whatever the new rules are. The hydrus mpv embed is amazing tech that I've butchered into shape in my crazy ass python Qt situation with duct tape. It has never worked in macOS. I am aware of another way of embedding it--iirc it is basically me spawning an OpenGL window and asking mpv to render to that, whereas atm I ask mpv to make its own window and attach it to my own window handle--which I'd like to prototype sometime as a more compatible alternative. A side thing as well--the native renderer is inefficient atm since it literally calls ffmpeg on the command line for a raw render on stdout, but there are one or two python libraries that may allow us to call ffmpeg in dll form, in our own process, which should reduce the overhead and overall jank in this system. It is something else we are thinking of as a possible project to try. And maybe one day I can figure out doing a 'software' audio renderer. We'll see how it all goes.
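Not hydev's actual code, just a toy sketch of the comparison design described above, to make the "A SYSTEM PRED | OPERATOR | MULTIPLE | B SYSTEM PRED" shape concrete. Each predicate reduces a media to a number, and one generic rule object compares A's number against a multiple of B's (all names and the dict-shaped media are hypothetical):

```python
import operator

OPS = {'>': operator.gt, '>=': operator.ge, '=': operator.eq,
       '<=': operator.le, '<': operator.lt}

# each 'system predicate' just eats a media and produces a number
def filesize(media):
    return media['num_bytes']

def height(media):
    return media['height']

def compare(a, b, pred, op, multiple=1.0):
    """True if pred(a) OP (multiple * pred(b)) -- the generalised rule."""
    return OPS[op](pred(a), multiple * pred(b))

a = {'num_bytes': 5_000_000, 'height': 2160}
b = {'num_bytes': 900_000, 'height': 1080}

compare(a, b, filesize, '>=', 5.0)  # "A is at least five times larger than B" -> True
compare(a, b, height, '>=', 2.0)    # "A system:height >= 2x B system:height" -> True
```

The nice property hydev describes falls out naturally here: adding a new testable variable is just another one-line number-producing function, with no new comparison UI or logic.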
Really simple request; for Drag-and-Drop renaming in Exporting, could there be an option to choose which hash it defaults? The absolute majority of sites and boorus default to md5 hashes, not Hydrus' preferred sha256. I'd like it to set to export as md5 by default.
In file import options, can we get an option "if file already exists in database, overwrite its import date to now"? There are some situations where that will be useful.
Alright boys, been using Imgbrd grabber for a while but I'm getting tired of how it's getting buggier and buggier every update and they never fix videos. Tried to use Hydrus a month or two ago but all the words filtered the fuck out of me, and I feel like trying again today. Doing my best using the WORDSWORDSWORDS guide, but I still gotta ask, any ways to transfer bookmarks/favorites from that software to Hydrus? And anyone has a CSS/Style sheet for dark mode? When I use the color change thing it's still kinda half-white even with CSS override.
>>17449 never heard of imgbrd grabber so I can't offer help there, but assuming that bookmarks and favorites are something you can export to a file, Hydrus has a sidecar system that can import files together with metadata read from sidecar files. that might be able to help, but I never used it before because I never needed it.
>darkmode
go to "options → colors → current colorset" and change it to darkmode, then in "help" enable darkmode. I don't know why you have to set 2 different options, and I don't know why the darkmode checkbox is under help, but that should do it.
>>17450 >import files with metadata Yeah that one somewhat works, I'll see what I can do with that. >colors That one actually didn't work, but I managed in the styles tab by choosing dark blue. Thanks.
any ways to just browse images in this thing or is it always just automatic download of everything for a tag?
>>17452 what?
>>17453 any way to just look at the images and then decide what i want to save when i look up a tag or am i forced to dl everything and then delete what i don't want? like if i write artist name, am i forced to get everything from that artist or can i look up their shit and then decide what i want to dl
>>17454 no, you download everything, that's the entire point. either write search requests for what you want, or get an extension like Hydrus Companion that lets you pick and choose what you download
>>17451 Man, the import stuff is pretty nice but I wish I could figure out how to tag my old images better. Seems like I can put the folder they were in as metadata but it would've been more convenient if I could have set that metadata as artist tag instead since all my images were in individual artist named folders to begin with, would've saved a lot of work. Metadata artist name will have to do I guess.
>>17456 When you run a folder import you can add quick namespaces like "the second to last folder is artist:xyz" so if your path is /home/anon/pics/vincent van gogh/blue period you can automatically add "series:blue period" and "creator:vincent van gogh". If you already imported the images without deleting them you can re-import them with that rule and it'll add the tags. If you deleted them, you can use siblings to make them cleaner. Right click a tag > no siblings > add siblings to tag > leftmost box is "vincent van gogh" & rightmost box is "creator:vincent van gogh", that will automatically set all existing and future "vincent van gogh" tags to "creator:vincent van gogh". I'm using the standard hydrus tag namespaces but obviously you can use your own, it's your system.
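In plain Python, that quick-namespace rule boils down to something like this (a sketch of the mapping only, not hydrus's actual implementation; the rules dict counts folder positions from the end of the path):

```python
from pathlib import PurePosixPath

def tags_from_path(path, rules):
    """Map folder positions to namespaced tags.

    rules: {position_from_end: namespace}, where 1 is the innermost
    folder, 2 the one above it, and so on.
    """
    parts = PurePosixPath(path).parts
    return sorted(f'{ns}:{parts[-pos]}' for pos, ns in rules.items())

tags_from_path('/home/anon/pics/vincent van gogh/blue period',
               {1: 'series', 2: 'creator'})
# -> ['creator:vincent van gogh', 'series:blue period']
```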
Hi is it possible to add thumbnail support for archive file types? (.zip, .rar, .cbz, etc.) It can be as simple as reading a .thumb file if it exists inside the archive. I would love that since I can use hydrus to manage my texture files. Would help with stuffs like photobooks/artbooks/manga too.
keep up the good work dev, local booru was the missing piece to my self hosted paradise
>>17457 That worked pretty good, thanks. I had tried earlier but was confused because I wrote artist: but turns out you need to do creator:.
When sorting by a numeric rating, descending, with a limit, all the files on top have 0 rating.
>>17454 Install Hydrus Companion add-on in your browser, browse the website as usual and send the images you like to Hydrus through the companion add-on.
big n00b here, starting my selfhosting journey. Got a DS1515+ for free and I'd like to run hydrus on it so I have a 24/7 personal local-only booru running. I tried to install it via docker but I'm too stoopid to figure it out. I got the suika/hydrus image from the registry, created a container, and the log makes it look like it's working, but I cannot access it from the browser with http://yourdockerhost:5800/vnc.html or http://yourdockerhost:5900 using tiger VNC, so obviously I'm missing something stupid and obvious.
>>17437 >Smart Screen I think that should only check exe on launch. I had hydrus running, downloaded some stuff, refreshed the inbox and got force-closed and client exe deleted with no recovery. Seems more like behaviour pattern recognition rather than simple hash popularity check. Though after reinstall all works fine with the same database same images etc.
Am I tripping and using the wrong blacklist or it straight up doesn't work when I download stuff? Had blacklist from tags -> manage tags in the all known tags & downloader tags and copy pasted my e621 blacklist and I still see vore/scat/etc being downloaded. The only thing the blacklist seems to do is remove the tags from the images. The only thing I can understand not working is the artist names since I imagine I'd need to add "creator:" at the start of every artist I want blacklisted because lol syntax but I don't get why the rest is still popping up.
>>17449 I always found hydrus to be a lot better when you take things more slowly and sort of ease into it. Don't just dump your whole collection and expect things to work out of the box. That's why I always recommend new users to only try a few images at a time just to get the hang of it. Keep doing what you've been doing, but keep hydrus installed and always on your mind while still playing around with it until you're fully ready to commit. Definitely not brainlet friendly, but goddamn amazing when you set it up the way you want it.
>>17472 >I always recommend new users to only try a few images at a time just to get the hang of it. This.
I had an excellent week. The first version of duplicates auto-resolution is ready for advanced users to try out. There's not much else in the changelog! The release should be as normal tomorrow.
>>17469
>Had blacklist from tags -> manage tags in the all known tags & downloader tags and copy pasted my e621 blacklist and I still see vore/scat/etc being downloaded.
That section only controls display. What you want to do is go under "network" > "downloaders" > "manage default import options". Then click the "import options" button next to "default for file posts". Put your blacklist in "set file blacklist". This blacklist will apply to all websites. You can set blacklists on a per-website basis in the section below.
>>17478 Thanks a lot. Some of these settings are put in some really dumb locations; legit spent an hour looking through all the documentation and discussion to find where the correct blacklist is, and it's always either the wrong thing or outdated images.
https://www.youtube.com/watch?v=aqzzowQcPZ4
windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v615/Hydrus.Network.615.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v615/Hydrus.Network.615.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v615/Hydrus.Network.615.-.macOS.-.App.zip
linux
tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v615/Hydrus.Network.615.-.Linux.-.Executable.tar.zst

I had an excellent week. The duplicates auto-resolution system is ready for advanced users to try out.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

duplicates auto-resolution

Any user can run this test, but I'll ask that you only go for it if you have a bit of experience with processing duplicates. Make a backup before you update, then go here: https://hydrusnetwork.github.io/hydrus/advanced_duplicates_auto_resolution.html

Make sure your default duplicate merge options for 'better than' do not say 'archive both files', and then try out the jpeg/png rule. I want to know: does it all make sense to you; where does it run smooth; where does it run garbage; do you get any errors. I'm also generally interested in your numbers. Out of ~800k potential pairs, I had ~6,000 jpeg/png pixel dupes, resulting in ~4,700 actual pairs (numbers shrink because multiple good files can share the same bad file). The preview panel ran pretty badly, about 30 seconds to load numbers, but the actual work was about 20k files/s for the search stage and 10 files/s for resolution. I cleared about 7.5GB of Clipboard.png in seven minutes!

Thank you for trying it, and thank you for your patience as I figured all this out.

next week

Back to normal work, and I'll plan out future extensions to auto-resolution. Assuming the test goes well, I'll open up all the UI and invite everyone to try it out in v616.
Did the e621 tag importer break again? I can no longer download any tags other than rating.
>>17484
https://files.catbox.moe/s4nzrh.png
(have to upload via catbox because "file format not allowed" on upload)
Actually fixed: e621 changed their tag format and added some shitty fucking "uploaded by artist" twitter badge in svg in the creator span that has to be filtered out.
Note: this filters "contributor" into its own namespace, removes "creator:sound warning" because it's a trash tag, and changes "creator:third-party edit" to "meta:third-party edit" via regex substitution.
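For anyone wondering what those substitutions amount to, here's the same logic as a plain Python sketch. This is purely illustrative — the actual parser does this inside hydrus's string-processor/tag-filter UI, not in code like this.

```python
import re

# Illustrative only: the tag fixups described above, outside hydrus.
def remap_e621_tag(tag):
    if tag == 'creator:sound warning':
        return None  # drop the tag entirely
    # move "third-party edit" from the creator namespace to meta
    return re.sub(r'^creator:(third-party edit)$', r'meta:\1', tag)
```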
When running version 615 on Debian 12, the program gives an error:
symbol lookup error: /home/%my user name%/Downloads/Hydrus Network/libQt6WaylandClient.so.6: undefined symbol: wl_proxy_marshal_flags
What can be done? Try the Flathub version?
>>17482 So, I did this, went there, added the suggested rule, hit apply, and missed out on some of the cool stuff that was also there because I was half paying attention. Now, my thoughts on this process.
What I like: 1) that it automatically does everything. And that's about it. What I dislike: 1) that it automatically does everything.
Let me try to explain. I had a bit over 19k pixel perfect matches. It went down to about 26,000 images before it decided to not do anything, and I had to hit 'work harder'... not sure why. What I would like to see happen is, instead of auto deleting them (well, moving them to the trash), I would like the program to effectively earmark these and let me decide on how or when to resolve them. More or less, the program knows they are duplicates but lets me review them if I want to. I personally see this as a massive thing when it's not as obvious as jpeg/png pixel perfects; given how complex this seems, double checking a few of these to make sure it was set up right before I press forward would be nice.
Now, because I was a dumbass and skipped the interesting things, I have to refer to the images on the help page. Again, well aware I'm a dumbass, but the image with the preview is interesting to me, the one that shows the images side by side in previews. Would it be possible to do something like that with the current duplicate filter? Maybe as a companion to the large full image view? My use case is when I have an image set that may be 10+ images deep and all of them are minor variants, or when the differences are fuzzy enough that even larger differences still pop up as duplicates. Hell, I have several thousand pngs that are just shitposts someone made that downgrade image quality while keeping a large file size and are seen as duplicates in the filter; easy for a human, even from thumbnails, to figure it out, but it would take quite a bit of time loading up one image over the other.
overall I like this, but instead of fully auto resolving, can we have the program parse it, and then let us open it in some way to make damn sure it's doing what we want before we OK the next action?
>>17489 Never mind, "fixed" this by using Ubuntu.
mpv added HDR support for Wayland. https://www.phoronix.com/news/MPV-0.40-Released
Hi! I'm having this rare but still intermittent issue where the api just stops responding. The issue persists until restart. Windows platform. Usually it happens after I try to access videos via the api in quick succession. How would I even go about finding the reason for this? The "api test" option in the debug menu has no effect until the restart as well.
>>17484 Ah, I thought I had broken something myself, glad I checked the thread.
For duplicate auto-resolution rules, I hope to see fuzzy filesize comparison (A's filesize is within x% of B's filesize), comparison of image resolution and comparison of aspect ratios. For JPEGs, also quality/compression profile. I think having these types of rules available would make a lot of work with variant sets automatable, and even be able to reliably detect obvious better/worse pairs.
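The fuzzy filesize rule proposed above is simple enough to sketch in a few lines of Python. This is just an illustration of the proposal, not anything in hydrus; the function name and rule shape are made up.

```python
# Sketch of a fuzzy filesize comparator for auto-resolution rules:
# "A's filesize is within x% of B's filesize" (hypothetical).
def filesize_within_percent(a_bytes, b_bytes, percent):
    if b_bytes == 0:
        return a_bytes == 0
    return abs(a_bytes - b_bytes) <= b_bytes * percent / 100.0
```

A rule engine could combine several of these predicates (filesize, resolution, aspect ratio) with AND logic before declaring a confident better/worse pair.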
>>17488 >removes "creator:sound warning" because it's a trash tag and changes "creator:third-party edit" to "meta:third-party edit" via regex substitution. Why would you do that? If I wanted those changes I could do them non-destructively with display filters and siblings.
>>17501 Because I cannot imagine a single situation where having "creator:sound warning" as a tag would be useful. In fact, the tag is far more likely to be wrong than the built-in "has audio" file property.
>>17502 I don't care about your lack of imagination.
>>17502 it's useful if you don't want your ears blasted by a sudden loud noise. what's hard to get about that?
(30.33 KB 500x430 group of nuns.jpg)

>>17503 Then shut the fuck up and remove it from the parser filter for yourself. Here, I'll even give the exact path: Network -> downloader components -> manage parsers... -> "e621 file page parser" -> Edit -> content parsers -> creator tags -> Edit formula -> big string processor button -> select "TAG FILTER: ..." -> X -> Yes -> Apply everything.
>>17504 Then use the "system:has audio" predicate? Which does a better job? Out of 394 files with "creator:sound warning" in tags only downloaded from e621, 4 of them don't actually have any audio at all, and out of the remaining 527 files with audio, there are many that could be considered as having "obnoxiously loud sound". I'd take it as a good, valuable tag if it were mechanically applied, like if the e621 website detected any sound past 30dB (arbitrary) on upload and added the tag (which shouldn't be under 'artist' anyway), but it's clearly not. To be fair, on the same logic I should also filter out the other various file tags like "meta:sound" or "meta:webm", because they're all useless; I actually forgot they even existed.
>>17506 If you're posting something on the internet for people to use, try to be user-friendly instead of enforcing your own arbitrary biases. Imagine if I posted an e621 parser that got rid of the species namespace and just moved those tags to unnamespaced, and then said "hurr durr i cannot imagine a situation where you would want the species namespace". You can see how some people might be annoyed by that, right? Especially in the hydrus thread, where people are autistic about tags. I just don't see any reason to arbitrarily change the results to be different from what's actually on the page.
>>17504 If that's what sound warning was actually used for I'd agree.
>>17488 I'm dumb, explain this. To fix e621 tags not working I need to remove creator sound warning? I checked the catbox as well but it's just an image...
(33.23 KB 888x532 1.jpg)

>>17504
>it's useful
Sure, but only for you. Imagine if many of these moot feature requests were fulfilled; the software would be so bloated and slow that anons would be demanding a Hydrus Light version. What about trying the Mute icon before you load the videos instead?
>>17511 >Imagine if much of these moot feature requests are fulfilled what feature request? the reason they have the tag on e621 is to let people know that the video has a sudden audio spike. nothing about hydrus here so I don't know what you're responding to >What about if you try the Mute icon before you load the videos instead? because that just... mutes the audio?
>>17511 >there shouldn't be a jumpscare tag in any context >just like, don't play the video
>>17510 The image has the downloader data inside of it. Download the picture and put it somewhere you can find. Go to hydrus>network>downloaders>import downloaders. Click on Lain's face and find the image, it will tell you what you are importing, click yes and now the parser's been added. Now go to network>downloader components>url class links, find e621 file page, edit and set it to the new one, which should be e621 file page parser (1), matching parsers should be at the top.
In case anyone used the e621 with notes variant I updated it to work with the new parsing thing. I didn't change the sound warning stuff because I didn't feel like it. No image uploads so here it is: https://files.catbox.moe/t6jw3w.png
>>17512
>because that just... mutes the audio?
Exactly, then you adjust the volume, located right on top of the Mute icon, just before you click on a video. Not a big deal, I think.
any ways to make some images children/parents of others? i'm seeing the children/parent tag thing but it doesn't seem to correlate
>>17526 No, not yet, you can set them as alternates*, which just indicates they are related, it does not have any sort of hierarchical info. I think parent/child images might be planned but I might be making that up. *select 2 or more images>manage>file relationships>set as alternates.
>>17447 Great idea, thanks. I'd still like to completely rewrite how export filenames are created. We want something richer than the current template string, and we need better fallback logic for when some data is missing. I'll write this down and think about it.

>>17448 That's an interesting idea, thanks. I'll write it down.

>>17458 We do it for cbz now, although my cbz detection algorithm is a bit shaky, so if your cbzs are read as zip, let me know. I think it would be neat for normal archives to pick a decent thumb, although I know some users don't want it because many results will be unhelpful (imagine if you were storing lots of technical brush zips and the thumb every time was just a 20x20 white square or other weirdness), so I'd introduce it as optional for arbitrary archives, I think. We have zip reading right now (python comes with it), but I don't think we have any tech to read rars or 7z yet. I think it is fairly easy to add, but I haven't looked into it properly yet. I'd like thumbs for cbr and, is it called 'cb7'? I generally want more archive-parsing tech in future. You mention a .thumb file--is that something you see often? Is that a standard in a particular .zip format you use? Are there any others I could look for? My general approach here has been to list all the .jpgs and .pngs and just pick the one with the earliest filename, since that works for cbz. As I (slowly) roll this stuff out, let me know how it works for you!

>>17459 Thanks, I am glad you like it. If you are a new user, then, in a few weeks, once you are comfortable with things, let me know what has been difficult and easy to learn. Keeping the help guides updated and overall UI easy for new users to pick up is a constant battle.

>>17462 Thanks. I am afraid 'rating' file sort does not mix with system:limit yet. I added a tooltip to the file sort menu button a couple weeks ago so it actually tells you now which do and don't work.
That said, rating shouldn't be too difficult to add, so I'll look into it.
Edited last time by hydrus_dev on 03/29/2025 (Sat) 19:57:52.
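The 'earliest filename' cover-picking approach the dev describes above is easy to sketch with Python's built-in zip support. This is just an illustration of the idea, not hydrus's actual code (the real thing would also handle cbz detection, webp, and so on):

```python
import zipfile

IMAGE_EXTS = ('.jpg', '.jpeg', '.png')

# Sketch: list the archive's image members and take the earliest
# filename, which in a cbz is usually page 001, i.e. the cover.
def pick_cover_member(zip_path):
    with zipfile.ZipFile(zip_path) as zf:
        images = [name for name in zf.namelist()
                  if name.lower().endswith(IMAGE_EXTS)]
    return min(images) if images else None
```

A `.thumb`-file check, as the earlier anon suggested, would just be one extra lookup in `namelist()` before falling back to this.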
>>17467 I would like to help you, but I'm afraid I don't work on the Docker stuff, and I just have no Docker experience myself. Suika is on the discord if you want to talk to him directly.

>>17468 Thanks--sorry for the trouble! I wish these guys made it more obvious which behaviour they don't like, but I guess they keep those cards close to their chest precisely because they don't want the bad guys knowing. For a fun story, we once got an anti-virus false positive because the scanner sandbox ran the hydrus installer exe in some automatic mode, and we used to have an 'open hydrus getting started guide' checkbox on the last page. The sandbox clicked this somehow, I think even when we set it to default false, and that opened Internet Explorer, which triggered a Windows Update network call on the sandbox, and that triggered the false positive since the installer was doing weird network stuff, hooray.

>>17480 Sorry for the frustration! I'll put some time into the UI and help to make this less obtuse.

>>17484 >>17488 I've got a job to fix this in the defaults as well, btw. Looks like they changed their HTML recently. Either I or another user will look this week at an API solution, since it turns out their API provides tag namespaces. I'll figure out new downloaders for e926 and e6ai too.

>>17489 I am no Linux expert, but it seems recent Wayland changes don't play well with hydrus. There's some help here, under the Linux tab: https://hydrusnetwork.github.io/hydrus/getting_started_installing.html#installing Adding the "QT_QPA_PLATFORM=xcb" env variable seems to be the simple solution; it causes hydrus to launch with X11 instead. Running from source (which I think the flathub does, I don't know?) generally improves compatibility too. It is easy to set up yourself now: https://hydrusnetwork.github.io/hydrus/running_from_source.html
>>17492 Thank you for your feedback. You have a lot of dupes, so you have a special perspective. I've had similar thoughts from some other users today, basically that they'd like this system to be a touch less automatic and fire-and-forget, particularly as we push from easy pixel duplicates to more fuzzy situations. The preview panel is good, but it is too tucked away and not obvious (as you found too). I am going to think about this seriously this week. Maybe the rules get new pause modes or something where they do their calculations but hold off the final action, and then you can review it. Or perhaps I boost the preview panel to be more user-friendly and do a better job of showing upcoming outcomes. This is tricky, but I think I'll put real time into it. I built this system to be completely automatic, so my preference is to KISS and keep it that way, but several users want more control, so I'm thinking about what I can do without breaking the bank.

Another thing is firming up our tools to make sure these fuzzy choices are less uncertain. If I add some tech that does 'A is 99.7% similar to B', to arbitrary precision, then we will be able to filter out slight artist corrections in a way 'exact match - distance 0' currently cannot. Another option is to rewrite my perceptual hash system to use a longer hash so we can go finer than the current precision on the search end. An alternative is plugging the new 'comparator' tech the auto-resolution system has into the normal duplicate filter. I could feed the queue of what a rule would work on into the duplicate filter and arrange the A-B in the same way, and then you'd be processing it like an archive/delete filter, maybe. I don't know, I'm thinking about my options now.

>would it be possible to do something like that with the current duplicate filter? maybe as a companion to the large full image view?
Yeah, I think it could.
I thought making that 'thumbnails-in-a-panel' part would be the most difficult of the whole system, which is why I left it basically for last, but in the end it wasn't too bad. I feel better about having a sort of carousel for the media viewer in general now, and the duplicate filter could definitely have a carousel of two.

>>17497 Thanks, interesting. Presumably this 0.40 will percolate down to libmpv in several versions of Linux. I'm expecting to do another .dll test on the Windows side soon as well. We'll see what it fixes and breaks.

>>17498 Thank you for this report. Please tell the Client API in manage services to 'log requests', and turn on help->debug->profiling->profile mode. Then see if you can trigger the fault, and we'll see if anything stands out. I wouldn't pastebin your whole log here, and the profile will be huge, but if you email me or hit me up on discord, we can figure out a secure transfer, or you can look yourself and just cut and paste the last stuff that happens before it breaks. I think the log will say how long each Client API job takes. Maybe there will be some errors about saturated connections or something in the log. (view your log and new profile by hitting file->open->database directory)
>>17500 Thanks--I would like to roll out exactly this over the nearish future. I'm increasingly thinking I'm going to need a fine 'A is 99.7% pixel-similar to B' so we can exclude colour changes and tiny artist changes and stuff that slip through 'exact match - distance 0' search, but we'll see how it all shakes out. Please let me know how it goes for you as I roll it out. >>17526 >>17527 Yeah, set them as alternate for now. That's just a safe landing zone to hold those pairs for now. In the future when we have a better grapple on duplicates, I'll be writing a large expansion of the file relationships system to let us define non-duplicate file relationships like 'WIP' and "costume alternate' and 'messy/clean' and so on.
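As a toy illustration of the 'A is 99.7% pixel-similar to B' idea the dev floats above: the simplest possible version is just the fraction of positions where two decoded, equal-sized pixel sequences match exactly. Real work on bitmaps would be far more involved; everything here is hypothetical.

```python
# Toy percentage pixel-similarity: fraction of exactly-matching
# positions in two equal-length pixel sequences (illustrative only).
def pixel_similarity(pixels_a, pixels_b):
    if len(pixels_a) != len(pixels_b) or not pixels_a:
        return 0.0
    matches = sum(1 for a, b in zip(pixels_a, pixels_b) if a == b)
    return 100.0 * matches / len(pixels_a)
```

A threshold like `>= 99.7` on such a score would let tiny artist corrections through while still excluding genuinely different alternates, which is exactly the gap between 'exact match' and 'distance 0' discussed above.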
the sort by color features are so fun!
(210.33 KB 960x960 gigasmug.jpg)

>>17536 It's pretty neat, but I think sort by color balance is slightly flawed, as it ends up with a ton of black and white images in the middle, while sort by hue places images with low chromatic magnitude at the end, outside the rainbow. Since you can also sort by chromatic magnitude, which places mostly black and white images at one end, I think it would be feasible to filter most of these out of color balance like they are for hue sorting. In fact, I think it might not be too hard to implement a system:color (assuming such a function can't be folded into a different system predicate) that allows you to search above or below certain color balance, hue, chromatic magnitude, and lightness thresholds. The system already fetches all the information necessary to perform such a search in order to do these sorts, so it should mostly just be a matter of making another UI window if it were to be added. I don't think it'd be particularly useful outside of fetching images to make those color collage images like pic related, but it would be neat.
>>17538 Also, it's personally not much of an issue to me, as I already tag images as "black and white", which lets me exclude them from color sorts. The chromatic magnitude option really helps find stray black and white images I've missed though.
>>17530 In terms of the fuzziness, like you mentioned 99.7, would it be possible to have a "most likely compression noise" category, or, where there are areas with firmer differences in the details, a "most likely same set with slight difference" category? I think those two would be the... well, I don't want to say easy, but probably the easiest to make a filter set recognise.

As for the window with thumbs, it would be really nice to have a few buttons on the side: [a is better] [b is better] [related alternates] [false positive]. We could give a quick scrub through of the thumbnails that come up and pick out the obvious, and what's not obvious goes to fully opening the images and doing them one at a time while a/b-ing them. Hell, even the 'is better' isn't necessarily true, it's just that I'm keeping one of them and not the other.

For me, where things are going to get fuzzy is pngs and jpegs where I take the pngs out of hydrus, downscale/re-encode to jpeg and re-import. I could honestly see my biggest auto-resolve being png vs jpeg where they are 99.5% the same but the jpeg is half the size or less. I have already gone through around 200gb of files like this that I got rid of, not to mention the ones I kept to be processed later on. I would find auto'ing this immensely helpful if they are held for review. I personally see that as the main thing I would change/add: automatically process, but don't automatically delete/move to trash.

On a side note, I think it would be nice if you were able to open a window of all the resolved pairs. This is more of a me problem, but I had a number of files already in trash, and this dumped 19.5k files into the trash along with what was already there. It made looking through it a bit harder than it needed to be, though you can see by scope I'm probably a bit of an outlier in that regard.
>>17541
>on a side note, I think it would be nice if you were able to open a window of all the resolved pairs
Not sure if I'm understanding correctly, but I think it is the case that if you open the trash domain and then sort by 'time: import time', you will get the files sorted by the time they actually got imported into trash rather than imported into hydrus, which would be useful for you, wouldn't it? So everything already there would be at the beginning or end of the thumbnail grid, depending on whether you chose newest/oldest first. With the system predicate system:time, you can then search for stuff older/younger than a day, or stuff like that.
Is there a way to change the design of this board back to what we had recently?
>>17543 The dropdown at the top.
>>17544 Thanks! In another browser i can see it, not in my old not updated one though :p
I have no idea what this tag data is and would rather clear it than upload it, but "forget" does nothing. A search for "exclude current tags" + "has tags" on the PTR gives no results. Any ideas?
I had a great week. Last week's duplicates auto-resolution test suggested we needed some new tech for semi-automatic behaviour and an audit log, and I managed to get it done for a second test. I also have some misc bug fixes and a new e621 downloader that works a lot better and fixes the recent broken tags. The release should be as normal tomorrow.

>>17548 Did you earlier do a 'commit' of your normal pending mappings, and this was 'left over'? If so, this could be a miscount--can you try database->regenerate->total pending count, in the pending menu? If this is not a miscount, it could be pending siblings, parents, or deleted (petitioned) mappings. Check the tags->siblings/parents dialogs to see if there are any there, although I presume you'd know if you pended a couple thousand siblings. Petitioned mappings are difficult to search for arbitrarily, but if you already flushed your normal pending mappings, the petitioned should have gone at the same time (same for siblings and parents--I think this is a miscount).
>>17549 Thank you, the regenerate did it
good morning sirs. first of all, thanks for this great software, it's really useful and I haven't even touched downloaders yet... Second, when viewing my stuff, is there a way to group them based on the file's modification date? (e.g. group by month) I dumped all my ShareX screenshots into my db; originally they were in folders like 2024-7 or such, so it was easy to check for a given month, but right now all of them are just there in a big pile. It would be nice to see them on a month basis.
https://www.youtube.com/watch?v=m8y066LgdUg
windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v616/Hydrus.Network.616.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v616/Hydrus.Network.616.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v616/Hydrus.Network.616.-.macOS.-.App.zip
linux
tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v616/Hydrus.Network.616.-.Linux.-.Executable.tar.zst

I had a great week. The e621 downloader is fixed and improved, and there's a new round of testing for duplicates auto-resolution.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

highlights

e621 have been changing their site recently, and it broke our parser for getting tags. Luckily, it turns out they have an excellent API, and today I am rolling out an update to the main e621 downloader that works much more efficiently. Thanks to the users who helped with it all. I have also added new downloader entries for e926 (their sfw alternative) and e6ai (their AI containment booru). All also now support pool URL parsing. If you have e621 subscriptions, you shouldn't have to do anything--but note that it will spend a bit of CPU one time catching up before getting back to normal. Let me know if you run into any problems!

If you use subscriptions a lot, you'll have seen a popup sometime that said 'subscription found x new URLs without running into any it had seen before', with a button to make a 'gap downloader'. This usually happens because either a guy uploaded a whole bunch of files very quickly or the site changed URL format. From now on, the client tries to detect the latter situation and will not bother you with the popup spam any more. This will happen this week with the e621 subs.

Thanks to a user, the Media Viewer's 'eye' menu button gets some experimental commands to set 'always on top' and remove the window frame.
Also, there's a new shortcut, and a new checkbox under options->media viewer, that lets you always return to the page that launched a media viewer on media viewer close (helpful if you use multiple media viewers). You can now mix and match multiple system:hash predicates. Export Folders that have multiple sidecars that export to the same location are now fixed (previously only one sidecar was actually writing to the file). A new label on the Export Folder UI also notes that if you change your sidecar settings, you need to delete any existing sidecar files to get it to regen and update. Sorry for the trouble this caused recently!

duplicates auto-resolution

The system is not launching for all users this week--I am going for Test Round 2. Thank you to everyone who tried out duplicates auto-resolution last week. The system performs technically well, but the feedback was clear that we need A) a way to pump the brakes and allow a human to optionally approve actions, and B) a way to review all the actions a rule has made, just to check things are all good. I pushed hard this week and got both features ready for a test. Note that if you still have last week's jpeg/png rule in your system, it will be deleted on update. You'll get a popup about it.

Please check the updated help here: https://hydrusnetwork.github.io/hydrus/advanced_duplicates_auto_resolution.html

The main difference is: instead of a pause checkbox, rules now have a dropdown of 'pause', 'semi-automatic', and 'fully automatic'. New rules default to this new 'semi-automatic' mode, which means a rule will do search and test work, but instead of actioning the pairs, it puts them in a queue where you can review and approve/deny them. There's a new 'review actions' button on the main sidebar panel that lets you check out these pending actions and the previously actioned pair audit log.
I've also improved some quality of life all over, including showing on preview thumbnail lists the expected content updates if the action is taken (e.g. whether A, B will be archived, and which tags, ratings, URLs, whatever will be added). I think some of the UI will need more work to fit IRL data, particularly the content update preview. Please have another poke around and let me know how it goes! next week I need an easy week, so I'll just do some boring cleanup.
(155.85 KB 959x417 Screenshot_20250402_203951.jpg)

>>17553
>unit test
This is a guarantee that the software is solid. Thanks devanon.
(215.84 KB 1366x724 Screenshot_20250403_063815.jpg)

I just found out about the new zoom stuff. Super cool. Thanks!!!
Does anyone have an updated reddit gallery parser? Mine is still broken. I've been trying to make a new one, but it looks like the content is mainly loaded with javascript and doesn't show up in the test data fetch from the url. Is there a way to get around that, or a tutorial I can follow?
>>17551
system:time should have what you need for basic searching by months, but it's kind of clunky for going through month by month, since I think it would need to be 'after month 1st' but 'before month 28th/30th/31st'. Makes me think of a suggestion for a time collection: collect by year/month/week/day, or some quick time selections, like how resolution has a 16:9 preset, time could have 'x month' sorta stuff.
>>17553
Double-clicking a row in the preview shows one of the files, the same one every time for that row. Which one is it?
In the file relationships predicate, I'd really like a way to specify the max distance for "potential duplicates" relationships when searching. Then I'd be able to have a search like "files with no distance 2 potential duplicates", which would find files that don't have any current potential duplicates at that distance, but are still allowed to have ones at a higher distance that Hydrus has already computed. It'd make having Hydrus search for potential duplicates at higher distances much less of a problem.
for json parsers, would you mind adding a way to select keys based on their values, and to select keys that are on the same level, instead of only being able to go deeper in? You can do this with html parsers, but I don't see a way to do it with json, and it's necessary for a parser I want to make, so I'm stuck atm.
>>17564 I can show you a concrete example of what I want to do if that would help you understand. lemme know
>>17553
>the 'search enabled' system, which on very old subscription presentation pages would allow for a search page that just presented files and had no search panel in the sidebar, is completely removed. the echoes of this thing have caused several typo problems over the years, was anti-KISS, and I intend to completely replace it with a nicer dynamic search freeze/sync system that will sync the page's current files to a fresh system:hash pred
If I have any of those old pages still hanging around, will they be deleted on update?
A new select option, "All of this file type", would be convenient. I can't figure out a way to do this easily except sorting by file type and manually selecting. My main use would be selecting archives Hydrus has downloaded among images, so that I can export and extract them.
>>17553 oh for fuck's sake this update wiped my defaults on the e621 importer, so it all imported to downloaded tags instead of my separate E621 tag repository
Any way to prevent automatic metadata fixing on images when I edit them to remove/add some shit? It seems to detect any changes when I change the image resolution.
Hey so, today my computer flashed a "Repairing drive" prompt for a couple of seconds and then this happened. How fucked am I?
>>17573
Uh.... likely EXTREMELY fucked. Hope you backed up your hard drive.
>>17573 Maybe one of the database tables got moved/renamed? Look for it. If not, restore backup. You did back up your database, right?
>>17536 >>17538 >>17539
Yeah, I like it too. It was mostly just for fun, but a user sent me some interesting articles about how to calculate the 'primary colour' of an image better than just taking an average, so I've got some homework to do, and then I think I'll implement that and it'll be more human and useful. I generally want histogram data too, and colour search, and since the sort wasn't actually that difficult to do, I'm feeling more enthusiastic about it than I was. Basically, if you convert to HSL or Lab, this is all trivial, and now I understand why colourspaces like that got invented.

>>17541
>would it be possible to have a "most likely compression noise" or when there is an areas where the details are quite a bit more firm "most likely same set with slight difference"
That's the hope, but what the actual technical answer is, I do not know yet. I remember reading a paper a million years ago that attempted to do jpeg artifact detection, specifically trying to determine the % jpeg quality from pixel inspection alone, and the answer was 'we could not figure out a super reliable way to get it'. My plan is to write a simple pixel-comparator and then we play around with 99.7 or 99.997 or 95 or whatever and see if we can find some sweet spots that generally differentiate 'these are exactly the same but for jpeg encoding' vs 'there is a watermark in the corner'. We might need to make it more complicated than a general comparison, maybe add some standard deviations in there, or grid-based location distribution, since a watermark is focused in one area but jpeg stuff will be all over. But I don't know what we are looking for, so I want to get some IRL numbers back and then iterate on where it fails.

>it would be really nice to have on the side a few buttons
I am sorry to say I don't want to do this. I want the auto-resolution system to push users towards automation, where a failure state encourages the user to set up new rules for better auto-filtering (or gets me to write better auto-filtering tools), rather than sucking in more human work time with another full-on filter inside it all. Just a simple yes/no confirmation, as current, is about as far as I want to go. On the other side of things, I'd love to update the existing filter with the tech we are figuring out here--for instance adding a "these files are 98% similar, no watermarks" line to the comparison statements in the dupe filter. If users still wanted it, I could also just let you load up a semi-automatic pending queue inside a duplicate filter, rather than have you poke around the thumbnail pairs.

>on a side note, I think it would be nice if you were able to open a window of all the resolved pairs
Yeah, and a list for the declined too. I can figure out some buttons for 'spam all these pairs to a new page', and even though that makes a gigantic grid of thumbs, I can at least preserve pair order.

>>17542
I expect he wants to see the actual pairs, both files, rather than just what was deleted.
>>17573
Sorry to hear this! Most of these problems are recoverable. Go to "install_dir/db/help my db is broke.txt" for your next steps. Hit me up if you run into any trouble with it.

>>17551 >>17561
Yeah, you can search for time with 'system:time', and in the top-left you can change the file sort to time->modified time. Hydrus does not have any nice 'group by' UI yet. I've been thinking of it for ages, but it just hasn't happened. I expect to revisit the thought seriously when a long-planned complete overhaul of the thumbnail grid is done. I'm hacking around with old custom bullshit code right now, but if and when this stuff works in a more Qt-friendly way, it'll be much easier to add in separators and stuff for 'group by' display.

>>17554
I'm shit at this usually, but since this is an automated system and I'm terrified of something going wrong, I'm trying to do it more professionally. We'll see how it all goes!

>>17555
A user contributed most of the tech here, so it is thanks to them! I'm happy with how the integration worked out, but let me know if you think it could remember some option better somewhere etc.. The same guy did the always-on-top and frameless stuff this last week, so give those a go too.

>>17562
It loads up the AB pair, both files, starting with A. I'll see if I can change the labels in the media viewer from '1/2' and '2/2' to 'A' and 'B'.

>>17563
Thanks, I will see what I can do.

>>17564 >>17565
Yes please; sorry to be a pain, but can you give me a simple pretend example of what you have in your document and what you want to test and fetch, so I know exactly what you are going for? I am a shoddy sentence-parser at the best of times.

>>17566
No, they'll just have a search panel now. They will probably have no predicates in them, so don't hit F5. If you want to hang on to the files more safely, do a ctrl+a on them and open->in a new page, and you'll get a fresh page that's more like the modern 'file publishing' system, where the page starts with a 'system:hash' of the files, so it survives an F5.
>>17573
I should follow up here, since you haven't just had damage, but a whole important missing table (or tables), and fixing that will need me at some point: check your log file, also in the db directory, and scroll right to the bottom. Let's see what tables are missing. If it is just one table and something boring, we'll be able to re-insert a stub and get you booting again, but if there are many missing tables of very critical data, we may have a tougher time. If you would like to talk one on one, please email me or DM me on discord.

Also, I think the document says this, but your absolute first priority is making sure you are on a stable and healthy hardware situation. If that hard drive just had a fault that windows recognised and stepped in to fix, and you know it wasn't because of some recent one-off event like a power cut, you cannot trust it. Do not do repair operations on the drive, since this will only push it harder, and the repair may break things even worse. Time to exfiltrate everything off any broken hardware you know of. If you have a backup, this is the time to roll back...
is there a way to set defaults for the filename and directory namespaces under "misc" in the "add tags" import dialog? I could've sworn there was a place to set the namespace defaults, but I can't find it.
(113.38 KB 1280x720 mault.jpg)

>>17582
For backing up files, am I correct in thinking that I only have to back up the following files: client.master.db, client.mappings.db, and client.db? Caches can be recomputed, yes? More importantly, wouldn't it be possible to separate out the PTR tables from the rest and back up only non-PTR data? It seems redundant to back up something that can be downloaded at any time, with only bandwidth and some processing time being an obstacle.
>>17580 >colour search Use colors as additional tags when suggesting tags?
(344.00 KB 1103x633 Screenshot_20250406_100624.jpg)

>>17586
>I only have to back up the following tables: client.master.db, client.mappings.db and client.db?
Actually, you need to back up all four .db files (client.caches.db as well). In my case, and as a measure of ultra-precaution, I do it every time before I upgrade Hydrus.
Has anyone found a way to download NSFW images from CivitAI?
I was thinking about changing to Hydownloader for e621 and would like to hear your thoughts for e621 or any other booru-style sites, and whether any of this would apply to gelbooru, ATF, or others. Right now, I think the standard Hydrus Network downloader is better:

- Since Hydownloader uses IDs as the anchor, it can't fetch replacements, and replacements are a thing for some of the bigger artists.
- Hydrus can fetch updated tags without having to re-download the file; e621 is the only site I know where users go back to older posts to update them.
- It automatically stops fetching dead subscriptions. This is irrelevant on most sites, but the amount of e621 subs I have (3,065) makes me wonder if it's significant here.
- It can automatically download deleted posts from archive sites (after playing around with the url classes and parsers).

But Hydl has pool fetching. Pools change, though, leading to the same issue as updated tags, and there is a way to handle pools for e621 in Hydrus:
>start a downloader page for e621 pool search
>insert all the pools
>let hydrus finish fetching all the pools.json; it detects the image is already in the db and doesn't ping the site for more
>select all queries -> show files -> default presented files
>select all files and incremental-tag all of them with something like e621_pool_order:1…9999
>next time you do it, use the tag manager to delete all tags with namespace e621_pool_order

A note on fetching translations: they still haven't been implemented in Hydrus Network, and I figure I could do something similar to the above if there was a mass delete notes function. Unfortunately e621 doesn't do queries with "notes:*", but it does do "notes:*a*", which I think is close enough.
>>17183 (OP)
Started using Hydrus and it's amazing, but sorry if I'm being dumb or some shit: how do I make the gelbooru downloader (from gallery) also download loli/"all site content"? URL import works fine, but if I add an artist to the gallery downloader it skips over those. Tried to log in so it would have the option toggled, but it didn't do much.
(20.29 KB 1658x104 20250406_214853_Screen.png)

What does this mean? Do I need to install something? I thought PySide would be installed in the venv? Context: this is the result of running the setup_venv.sh script when trying to build from source.
>>17596 What version of python are you on?
>>17597 3.13.2
>>17597 >>17599
I'm trying the (a)dvanced install and selecting (t)est each time, as referenced on https://hydrusnetwork.github.io/hydrus/running_from_source.html#summary
<If you are already on newer python, like 3.12+, that's ok--you might need to select the 'advanced' setup later on and choose the '(t)est' options. If you are stuck on 3.9, try the same thing, but with the '(o)lder' options (but I can't promise it will work!).
HOLY SHIT WHY DID I NOT TRY THIS SOONER I CREATED A .desktop FILE FOR HYDRUS AND SET Exec=env WAYLAND_DISPLAY= /sekritpath/hydrus/hydrus_client.sh WHERE 'env WAYLAND_DISPLAY' IS IMPORTANT AND FINALLY THE WAYLAND ISSUES ARE GONE (because I'm just not using it lmao) AND VIDEOS ARE NO LONGER CRASHING MY SESSION (WITH NO SURVIVORS) AAAAAA I AM SO STUPID AND THE WORST PART IS I CHANGED MY OUTLOOK ON LIFE AND SHOULD PROBABLY JUST STOP HOARDING THIS STUFF ANYWAYS Oh well, better late than never.
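For anyone who wants to replicate this, here is a minimal .desktop sketch. The path and entry name are placeholders for your own install; the whole trick is that an empty WAYLAND_DISPLAY stops Qt from picking the wayland platform plugin, so the client runs under XWayland/X11 instead.

```ini
[Desktop Entry]
Type=Application
Name=Hydrus Client
# Empty WAYLAND_DISPLAY = Qt falls back to X11/XWayland.
Exec=env WAYLAND_DISPLAY= /path/to/hydrus/hydrus_client.sh
Terminal=false
```

Setting QT_QPA_PLATFORM=xcb in the Exec line should have a similar effect, if you'd rather be explicit about the platform plugin.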
>>17569
Did you ever try clicking on the 'system:filetype' predicate? There you can filter to show only the filetypes you want, and then select all, which has the same effect. When editing that system predicate, you can click the arrow before a category like 'images'; it will expand, and you can check the checkboxes for each supported extension. You can also just type it into the search query, something like 'system:filetype is zip' or '... is rar'.
>>17601 nigga just stop using wayland, it's the goyest of goyslop
>>17602 Is that available on import pages? Can't check right now, but I thought not?
(4.33 KB 512x102 Content Parser - 242B.png)

sometimes, an artist on Xitter will make art in response to a scenario that someone comes up with. usually when they do this, they'll quote the post that they're making the art in response to. I made an fxtwitter content parser for this to save the text of the quoted post as a note on the file.
>>17605
Hmm, no, you can't, but there is a workaround. You can select all files on your download/import page like in your screenshot. Then right-click any file -> open -> in a new page. This copies all files into a new page that has the search bar, where you can enter what I mentioned. If you want another safety net, you can right-click your import/download page at the top and duplicate it before doing that, but it isn't necessary.
>>17596 >>17599
Ah, okay. PySide6 supports Python 3.13 starting with 6.8.1. So you'll either need to install Python 3.12, or install the rest of the dependencies as-is, then separately pip install the latest PySide6.
>>17596 >>17597 >>17599 >>17609 Sorry for the trouble here. This >>17600 should be the solution, so I'm not sure what is going on if you are still getting that error. Choose the (a)dvanced update, and when it asks you about Qt, go for the (t) option. This is the critical part: https://github.com/hydrusnetwork/hydrus/blob/master/setup_venv.sh#L79 - question https://github.com/hydrusnetwork/hydrus/blob/master/setup_venv.sh#L212 - execution Which for a couple months now has gone for "PySide6==6.8.2.1". If you are on an older version of hydrus, this wasn't true, so if you aren't on v616, try updating. If (t) still does not work, or you cannot update and need this to work on older code, try (w)rite your own and put in 6.8.2.1 yourself (2.4.3 for the QtPy question).
Any plans to implement 'duplicate sync' after the duplicate auto-resolution? >a job built in to apply duplicate tag migrations again to get new tags from previously deleted files
I took an easy week, just cleaned some code and fixed a few bugs. The release should be as normal tomorrow.
So I might be ignoring something very obvious in the documentation, but does Hydrus have any feature yet that assists with organizing and displaying things like artbooks and doujins that require a specific sequential order, and that you might want to always open collectively, either in a single instance of the image viewer that functions like a comic reader, or as a group in a new page, for example? I've managed to jury-rig a "bookshelf" kind of effect by having a specific "this is a doujin/artbook" tag on each cover page, which I can then search for to pull up the whole "library", and use the title:[whatever it may be] tag on the side panel to actually "open the book". But I'm wondering if I'm missing some functionality that makes it easier [like Collections; I have no idea what the Collections and Unmatched dropdown on the sidebar is for], plus the "book" breaks a bit if the pages are imported out of order.

Additionally, unrelated question: is there any way to easily see all the tags that share a single namespace? I would like to be able to look up, for example, the creator: namespace and get a list of all the different tags that use it. It'd help me when chasing down particular namespaced tags that I've assigned myself but forgot existed from not using them too often.
>>17581
>Yes please; sorry to be a pain, but can you give me a simple pretend example of what you have in your document and what you want to test and fetch, so I know exactly what you are going for? I am a shoddy sentence-parser at the best of times.
Sure, and honestly I don't know why I didn't just show you to begin with.

"raw_text": {
    "text": "can we do that thing?",
    "facets": [
        {
            "type": "tag",
            "indices": [21, 47],
            "original": "blue_hair"
        },
        {
            "type": "tag",
            "indices": [12, 91],
            "original": "skirt"
        },
        {
            "type": "media",
            "indices": [96, 119],
            "id": "188676424236144"
        }
    ]
}

This is the kind of structure I'm trying to parse. I want to get the tags. I need to look at the `type` key and make sure that the value is `tag`, then, if it is, I want to grab the value of the `original` key on the same level as the `type` key that I checked. Basically, what I want is something like "Search keys of `type` with value `tag`, then get the value of the `original` key on this same level." Tags aren't the only things that can be the value of `original` keys, so it's not enough to do it bluntly like that. I have to make sure that the `type` value says that the content of `original` is indeed a tag, or the parser will start adding junk tags.
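For what it's worth, in plain Python the requested operation is a one-line filter, which might help pin down the feature request. This is just an illustration against a trimmed copy of the structure above, not hydrus parser code:

```python
import json

# A trimmed copy of the structure in the post above.
doc = json.loads('''
{
  "raw_text": {
    "text": "can we do that thing?",
    "facets": [
      {"type": "tag", "original": "blue_hair"},
      {"type": "tag", "original": "skirt"},
      {"type": "media", "id": "188676424236144"}
    ]
  }
}
''')

# "Search keys of `type` with value `tag`, then get the value of the
# `original` key on this same level":
tags = [facet["original"]
        for facet in doc["raw_text"]["facets"]
        if facet.get("type") == "tag"]
# tags == ["blue_hair", "skirt"]
```

The media entry is correctly skipped because its `type` is not `tag`, which is exactly the sibling-test-then-fetch step the json parser UI currently lacks.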
>DatabaseError: database disk image is malformed
Well, after several years it finally happened. Something happened while I was away, and Hydrus appears to have crashed silently. Nothing in the logs. Weird. Restarted, and eventually the DB errors started appearing. I suspect a RAM issue, but I'm not sure. Probably it was writing to a DB when it crashed. Running v616 on Windows 10. Also, my DBs and client_files are stored on different drives. Chkdsk finds no errors on either drive, so I don't think I've lost any actual files, just corrupted the DBs. Pastebin of client log @ https://pastebin.com/bccvnemf

Shut everything down. Backed up all 4 potentially corrupt DBs twice: one set to work on, one just in case. Also have a backup from 2 weeks ago I made before jumping from v615 to v616. Running thru the "help my db is broke" readme, client.master.db passes the integrity check, but the other 3 all fail. Unfortunately, the first step in attempting to repair the DBs, the .clone command, silently fails. All that happens is the command runs, the sqlite3 CLI closes instantly, and I end up with a 0kb "cloned" db. This happens for all 3 DBs. This is clearly not what is supposed to happen. I grabbed the latest sqlite3 CLI tools, as they are a slightly higher version number, and got the same results. So that's a fail.

Moved on to trying the .recover command. Started with client.caches.db, as it was the smallest borked DB at 14.6GB. The readme says this is "very very slow". Can confirm; I let it run for about 36 hours before giving up on it. At this point I just restored my backup from 2 weeks ago. The vast majority of everything is "backed up" on the PTR anyways. Figured I'd just restore from backup, regenerate the DB stuff, resync back up with the PTR, and call it a day. Nope. Pretty quickly I get the same malformed DB errors again. Well, fuck. I'm worried this may have unearthed some random issue that's been lying dormant for years and may invalidate my backups.

I have not yet attempted the .dump command, as that's apparently even slower than .recover. Currently I've restored the backup from when I first started getting errors. Hydrus boots fine, and I immediately went in and disabled anything I could think of that could make DB writes, to try and minimize any additional damage (disabled client api, networking, paused all subscriptions, paused PTR sync). Exported the files in each open tab to a folder, so I know what I was working on. Backed up my subscriptions. Now I'm waiting on an export of all 650k+ files and all the sidecars I could think of that I would want. It's a slow process, and possibly overkill. Still not as slow as the .recover command, though. I figure worst case scenario I waste some time and have an extra "plain text" backup. If I do have to nuke everything from orbit, I can just re-import everything with sidecars. Doing random spot checks as it exports, and I have not found any obvious issues.

Also, is there any way to export ratings to sidecars? I've created a few, but they're not majorly important. When I first started with Hydrus I was using a few local tags like "rating:favorite" or "rating:cute" before I really dug into the custom rating system. And autism compelled me to keep using tags for ratings even after creating ratings, because duplicate work is fun! It means I will lose little to no ratings, but it'd save a lot of clicks if there was a way to export ratings too.

Is there any hope of recovery? Anything else I should be exporting "just in case"? Or do I just cut my losses, start over, and re-import?
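As an aside for anyone following along: the integrity check the readme walks through can also be driven from Python's bundled sqlite3 module, which makes it easy to script over all four files at once. A sketch (point it at copies, never the originals you are trying to preserve):

```python
import sqlite3

def integrity_ok(db_path: str) -> bool:
    """Run PRAGMA integrity_check; True only if SQLite reports a single 'ok'.

    On a malformed db this instead returns rows describing the damage
    (or the connection raises sqlite3.DatabaseError outright).
    """
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute("PRAGMA integrity_check;").fetchall()
    finally:
        con.close()
    return rows == [("ok",)]
```

Looping this over client.db, client.caches.db, client.mappings.db, and client.master.db gives a quick pass/fail summary before committing to a multi-hour .recover run.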
man Jpeg-XL is crazy good! I just exported a psd file to a lossless jxl, and compared it to the png version once I imported it into Hydrus, and it's less than half the size for the exact same pixel content. absolutely wild!
Would it be possible to implement a wildcard filter in the tag black/whitelists? e.g. I'd like to filter out all the Pixiv tags ending with "users入り". Entering "*users入り" into the blacklist doesn't seem to have the desired effect.
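For reference, the behaviour being asked for is ordinary glob matching. A quick Python illustration of what "*users入り" should catch (this only shows the desired semantics; it says nothing about how hydrus's blacklist actually matches internally):

```python
from fnmatch import fnmatchcase

blacklist_pattern = "*users入り"
incoming_tags = ["1000users入り", "50000users入り", "blue hair", "users入りの絵"]

# Tags that survive the blacklist: only those the glob does NOT match.
# Note the pattern is anchored to the whole tag, so a tag merely
# containing "users入り" in the middle is kept.
kept = [t for t in incoming_tags if not fnmatchcase(t, blacklist_pattern)]
# kept == ["blue hair", "users入りの絵"]
```

The anchoring detail matters: a leading "*" catches tags ending in "users入り", while catching it anywhere would need "*users入り*".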
>>17619 >is there any way that I can easily see what all the tags that I have sharing a single namespace are? character:* ---> Enter
>>17628
No, that simply calls up a search for every image that fits the namespace, doesn't it? I'm not interested in the images with that question. What I actually want is a place where I can see an ordered list showing me every single tag there exists/I've made with the same [whatever] namespace, like:

[namespace]: Tag A
[namespace]: Tag B
[namespace]: Tag C
[...]

and so on. Though I guess doing a global search WOULD populate the sidebar with ALL the namespace tags by necessity, but that seems like way too much of a brute force solution, and I'm hoping there's an easier one.
I've got a simple (I hope) feature request. Could you make it possible for the "time" content parsers to set the modified time for another domain? Currently they can only set the modified time for the domain the parser is for, but sometimes a site will actually keep track of the original post date of a file from a different website (usually the one it got the post from), so that'd be very useful to have. In the GUI, I imagine it could just be a checkbox that, when checked, activates a small textbox to put in the domain you want it to apply to.
https://www.youtube.com/watch?v=YQtHzDNAosY
windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v617/Hydrus.Network.617.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v617/Hydrus.Network.617.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v617/Hydrus.Network.617.-.macOS.-.App.zip
linux
tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v617/Hydrus.Network.617.-.Linux.-.Executable.tar.zst

I took an easy week, so I just have a couple of bug fixes!

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

highlights

If you need to import gigantic AVIF or HEIF files, like > 15,000x15,000, they should now work!

The e6ai downloader now grabs the 'director' tag under the 'creator' namespace. Thanks to the users who let me know how they do this.

If you have clever custom tag or note import options, I have added 'Gallery URL' url classes to the network->downloaders->manage default import options dialog. This allows us to set custom options for downloaders that have a gallery url that produces direct File URLs, just like the new e621 downloaders I rolled out last week. If you would like custom import options for the new e621 downloaders, please go in there and set up custom options for both the 'eXXX file page api' and 'eXXX gallery page api' classes--use the copy/paste buttons to make it easier. I generally do not like how complicated this has worked out, so if you would prefer it just work on domain--or really wouldn't want that--let me know what you think!

If you need to do a lot of file duplicates reset work, the commands for 'dissolve duplicate group', 'dissolve alternates group', 'delete false positives', 'reset potential search', and 'remove potential pairs' are now added as mappable actions for the 'media' shortcut set. These are powerful commands, always wrapped in a yes/no confirmation.

next week

I have some bug reports to put time into, and if I have time I will do a bit more duplicates auto-resolution tech.
>>17622 Sorry to hear about your trouble. If you still have the backup from two weeks ago, I think you are in a great position and can recover completely. First we need to make sure your hardware is ok. It sounds very much like you do have a RAM issue. Or could be a motherboard, power, or SATA cable thing. Some sort of intermittent hardware problem that is going to cause I/O to fuck up. I think you cannot trust your hardware for now, and before you try to set up a new client or try to restore your backup again or do any other recovery, you need to check that things are ok. I haven't done this myself, but I know there are some ram checkers around. Microsoft make one, maybe it is built into Windows, yeah, try searching 'Windows Memory Diagnostic' if Win 10 has that--it'll reboot and check your ram. Maybe it finds a problem, maybe not, but if it does, then you know what's going on, and you have to replace it before you can trust the machine again. Otherwise, crystaldiskinfo checks seemed to work ok, but maybe you can check your SATA cables anyway--sounds stupid, but are they seated good? Any kinks? Once you are very confident your hardware is happy (or just move to a new machine for now), I'd say just restore your backup from two weeks ago and see if it has malformed errors now. It could be the malformed errors are not, say, because of write I/O suddenly breaking a good db over and over, but read I/O, and everything was fine actually on disk, but you couldn't tell. If you need to extract ratings, you can only do it with the Client API for now, but bear with me, I will make sure we figure out a nice solution--I can hack it into sidecars in the next week or two if needed. But don't think about that just for now--I'd say make sure you have your best possible backup available on an external USB drive you trust, and then you do every test suite you can find to figure out what, if anything, is wrong with your machine. 
If you are totally sure it is ok, and the original crash was a power-off event or some other one-time weirdness, then we'll see what we have and what, if anything, we can fix or export from. But you being unable to clone right now suggests to me you still have hardware problems. Very similar clone problems happened to another dude a couple weeks ago, and it was a bad ram stick. Let me know how you get on!
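In the meantime, for anyone wanting to script the ratings extraction mentioned above, here is a rough sketch of the shape of the Client API call using only the standard library. The endpoint, parameter, and header names are from my reading of the Client API docs, and the 'ratings' field in file metadata only exists in reasonably recent client versions, so treat this as a starting point rather than gospel; the port and access key are placeholders.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API = "http://127.0.0.1:45869"  # default Client API port; yours may differ

def metadata_request(file_ids, access_key):
    """Build a GET /get_files/file_metadata request for the given file ids."""
    qs = urlencode({"file_ids": json.dumps(file_ids)})
    return Request(
        f"{API}/get_files/file_metadata?{qs}",
        headers={"Hydrus-Client-API-Access-Key": access_key},
    )

def fetch_ratings(file_ids, access_key):
    """Map file hash -> ratings dict, ready to dump into per-file sidecars."""
    with urlopen(metadata_request(file_ids, access_key)) as resp:
        metadata = json.loads(resp.read())["metadata"]
    return {m["hash"]: m.get("ratings", {}) for m in metadata}
```

From there, a json.dump per hash next to each exported file gives an improvised ratings sidecar until proper sidecar support lands.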
(118.51 KB 570x489 Screenshot_20250409_224857.jpg)

>>17629
>I'm not interested in the images with that question, what I actually want is a place where I can see an ordered list showing me every single tag there exists/I've made with the same [whatever] namespace
>ordered list
Is alphabetical order good enough for you?
>>17631
Have you messed around with the code for scrolling recently? I've been having incredibly annoying issues where scroll events get skipped, especially in the duplicate window. It seems to happen most when I'm moving the mouse while scrolling. I updated my venv just in case, but that only isolated the issue to scrolling-while-moving; it still skips scroll events. Might not be an issue on the Hydrus end, though.
>>17634 +1, also have had this issue recently. It was at least present on 614, haven't finished my backup before updating to 617 so idk if it's changed.
>>17633
The problem is that I still hit the same "issue" as before of needing to actually run a full image search for that. Say I have tagged twenty thousand images, with 15k of them using a namespace like user:tag A, 4.9k using user:tag B, and only 1 image using user:tag C, 2 images using user:tag D, etc., all the way to Z. If my goal is to remind myself what user:tag C through Z actually were, so I can start using them more in the future, I'd still have to run an image search that needlessly processes those 15k user:tag A images and 4.9k user:tag B images, loading them all up, rather than having a less brute-force option. Say, a search field in the "tags" page of the options menu where I could simply input the user: namespace and get a list of all the tags with that namespace, with less strain since it won't have to load the image data.

tl;dr: a search option for when I am not interested in looking at all the images that have user:tag a, user:tag b, user:tag c assigned to them, but simply in being told "hey, we looked up the user: namespace and found that tags a, b, c, etc. have been assigned SOMEWHERE using that namespace".
>>17633 >>17636 Just don't hit enter, then?
>>17637 Oh yeah, forgot that only lets you sort most to least used.
>>17637 Weird, mine doesn't seem to populate like that
>>17639 Wow, that must really gimp your searching ability. Hope you get that figured out.
Hey, I was wondering if there's a way to update the tags on an image by redownloading the same file. I realized that, using the simple downloader, the last 3k images I've downloaded only came in with one tag.
>>17641 Select your images from the gallery --> Right Click --> URLs --> Force Metadata Refresh This will tell Hydrus to re-check and grab any updated tags. Though if your downloader only grabbed one tag, it's possible your downloader may not be working right. But assuming it works correctly, the above should work.
>>17642 Thanks, but the refetch didn't want to work. It gave me this error. The downloader was fixed by updating the client, though.
(29.91 KB 826x479 Screenshot.png)

>>17639 Seems you didn't check some checkboxes. Click on 'tags' in the menu bar on top -> 'manage tag display and search...' -> now for each tag service (tabs on top !!!!!), you can set different behavior for autocomplete in the 'autocomplete' section (lower half where all the checkboxes are). Keep the mouse cursor on top of each option to get a tooltip. If you set it up kinda like the example in the picture, you can write 'user' and you get all user namespace tags and normal blue tags listed. By writing 'user:' or 'user:*' you will get only the user namespace tags, without the blue tags. Results are sorted by count though, and that can't be changed afaik. Keep in mind that if you check the 'always autocomplete' checkbox, hydrus will start fetching a lot of results immediately while you're writing. The number of characters only applies if the checkbox is not checked. Changing those options should take effect immediately after retyping at least a character in the search bar, no need to restart the client or anything. It's not that easy to understand what option does what at first, imo, so just play around and test the options (maybe with the PTR) to see what's good for you and your hardware.
>>17644 Ah, toggling the allow "namespace:*" option gives me exactly the kind of behavior I want [well, so does the regular "namespace:" option but I feel like it's better for me to limit it to the * option so that it's a bit more precise in when it triggers and only comes up when I am actually trying to make use of the feature], thank you! Funnily enough, however, this particular tooltip describes exactly the current behavior of my client, even though the option for it is toggled off [across all tag services, too]. Odd.
Probably a strange niche request, but would you mind adding a way to add a keybinding for a custom action in the duplicate filter? By that I mean a preset custom action, like "mark alternates, don't copy tags, delete the other" and stuff like that, except that pressing the keybinding just does the whole thing without you having to open the window and click through each time. There's a certain kind of custom action I do very often with certain kinds of files, but it doesn't make sense to make it the default, so I just have to manually enter the custom action every time. It'd be cool to have a shortcut to make that way quicker.
>>17642 damn I've been manually opening a new downloader page and forcing the refetch by forcing a page fetch for years. I didn't know there was a context menu action to do it automatically. Nice! that'll save a lot of time in the future
>>17632 Definitely a RAM issue. Windows Memory Diagnostic was somewhat useless. All it gives you is a pass/fail message. The RAM failed so I ran MemTest86 for a few hours. It failed spectacularly with over 300 errors on the first pass, so safe to say that's what caused the random crash. New RAM will arrive tomorrow. Once that's replaced what would be the best route to restoring data between the backup and the crash? Just overwrite the DBs with the backup and manually re-import anything newer than the backup?
>>17645 >Funnily enough, however, this particular tooltip describes exactly the current behavior of my client, even though the option for it is toggled off [across all tag services, too]. I initially didn't mention it, but I think it is because the behavior comes from autocomplete. That means if you change your '2 characters' to something big that you never reach, like 99, while keeping 'always autocomplete' checked off, you can test the behavior of the lower checkboxes better, I found. So if you write 'seri', with 99 you get six results (serie = 5, series = 15) on the PTR + 'all known files with tags' location. With 2 (or 'always' checked) you get a lot of series namespace results, but also not everything. What I found is that autocomplete only considers blue tags and, on namespaced tags, everything after the colon (:). So you might think it considers every series: tag, but if you look closely, it gives you only results where 'series' is mentioned on blue tags (multi-work series) or on namespaced tags after the colon (series: devil may cry (series)). The comparatively few entries on top without any 'series' mentioned after the colon, like 'series: final fantasy', only show because they have a sibling or parent that has it mentioned after the colon. That's why you need the lower checkboxes; they also consider every namespace result. The best way for me to confirm that was by searching for 'meta' on the PTR with that checkbox you show in your image on or off. With 'off' you find very few dark grey meta: tags, with 'on' you find a lot. But I have to correct myself, the search isn't always updating on retyping one character. For me it mostly does, but not always. Better to empty the search bar and write your search again for more consistent behavior after changing these options. What I kinda can't wrap my head around is the 'unnamespaced input gives...'
option: what can it offer that you can't have with the other options, and why, when this option is checked, does autocomplete behavior seem to be active even though I have 99 characters and 'always autocomplete' checked off when trying this option? I need concrete examples of what you can do with it that you can't with the others; I don't understand the tooltip, to be honest.
Finally updated to test out the early auto-dupe resolution, and it worked well for the test run. I could have sworn I had a lot more valid images; only about 2k dupes were auto processed from about 3.5 mil searched. Like the other anon said, it tested like 80% of the images and then I had to press 'work hard' to get them all done. I personally wouldn't mind it doing a hard run as it searches everything; maybe a prompt or a checkmark to run a search upon rule add might be nice? I first did a batch and they all looked fine, although some of the thumbnails didn't match, as in some looked like a lower quality thumbnail resize. I think I probably had thumbnails set as better/worse quality when one of the files came in and later increased/decreased it. But the files themselves matched, so I think it's fine. I definitely can't wait for more freedom to add rules, in particular png to png & jpeg to jpeg dupes, as those appear to be a majority of the remaining dupes. Also special shoutout to adding jpegxl filetype support for the auto-resolution, as I have to convert my jxls to post anywhere, and I usually get them back as pngs at 2x the size. I have a partially related question/concern regarding psd files & a feature idea. I have a bunch of psds which are essentially the source for a variant set, e.g. the one psd has all the variants under different layers and they can be toggled on & off to generate the variants. These often show up as pixel dupes or very similar dupes, I think because of the way psds are rendered in hydrus. However these are obviously not exact dupes, and I keep the psds around for a reason. Is there going to be a way to prevent the auto-resolver from touching these no matter what? Like a short-circuit negative search, like "if system: filetype is image project file, don't search any of the other rules" or something? Obviously you can exclude them in every rule, but for something you might never ever EVER want to auto-resolve, it might be nice.
Maybe have them be part of the default suggestions to keep unaware users from accidentally wiping half their libraries? I guess the default suggestions could stay limited to image files too, prevent it from even happening in the first place. I'm also imagining a situation where maybe you don't want to auto-resolve a dupe from some sort of set so you add a tag like "meta:don't auto-resolve" or something. Just some thoughts, I haven't considered all the knock-on effects so it might actually be a real pain, IDK. Nitpicks aside, I've been waiting eagerly for this auto-resolution tech and I'm pretty pleased with it so far. Great job, hydev!
Hey, discovered the software this morning and spent the entire day playing around with it. There's so much I love about it, and I wish more software out there followed similar design philosophy. Anyhow, I hope this isn't something TOO obvious that I'm missing, but... I've enabled the client API to access my images on my phone, and that works fine, but I've noticed there doesn't seem to be a way to restrict what file services are accessible? I don't have a use case for this at the moment, but I was considering sharing a client API with housemates so it seems like having some way to restrict file services via the permissions area would be ideal.
>>17586 You can recreate client.caches.db from nothing, and a last-resort recovery in some damaged database situations is simply to delete client.caches.db and then try booting, but recall how long it took to 'process' the PTR. If you sync with the PTR, then recreating your 20GB or whatever caches file is going to take like eight hours. Best and simplest to keep everything in one piece. Since the only reason to keep a backup is for recovery purposes, it should be tuned for doing that job easily. In an era of 8TB+ USB drives, the inconvenience isn't worth the space saved. I generally recommend just backing up your whole install or db folder. Make it a single atomic transaction to backup and restore. Easy rules that won't go wrong. As for storing the PTR's data in other database files, yeah, I think I might go in that direction eventually. Historically, there was a 'max attached num databases' value of like 5 or 6, I think. We are on 4 right now (client.db is main, then +3 externals, and the +1 client.temp.db that only exists while the client is running), in part because I didn't want to push the limit too hard. I believe the limit is much higher now, like 10, on stock SQLite, and on newer python you can actually talk to that variable and change it to any n you want. Now, I don't want to go up to 20 database files working on the same transaction because that will add overhead, and in the case of the normal WAL transactions, it actually adds annoying desync problems if there is a program crash during a transaction commit, but I think it would be reasonable to separate mappings into 'remote_mappings' and 'local_mappings'. Could do the same with parts of client.master.db, but it wouldn't be so important. I'm sure you know, but on a typical hydrus client that syncs with the PTR, client.mappings.db is like 99.7% PTR and the rest your local tags. 
If a crash hits and damages your db, it is likely to damage that file since it is the largest, and since the PTR is as you say totally recoverable one way or another, we probably don't want to store the non-easily-recoverable local mappings data alongside the PTR stuff. That said, and pertinent to your original question, client.caches.db stores a copy of the tags for your local files. There's a recovery routine that restores a damaged client.mappings.db by copying from client.caches.db. Duplicates of data don't hurt! Ultimately though, the best strategy against problems remains just maintaining a good backup. Like with client.caches.db, the PTR can be recomputed, but why not spend a little disk space ensuring that if you have a problem, the restoration takes just ten minutes of hard drive work copying files around rather than many many hours of background CPU work? As >>17590 says, I recommend backing up every time before you update. Once a week regardless works well. If I screwed up the update somehow, or if your computer blows up one day, the worst is you are out time and maybe some dosh. You invest hundreds of human hours into a big hydrus install, so insure that time. >>17584 I don't think so. Export folders have the same panel, and they remember those options, so maybe you are thinking of that. I am still overdue a complete overhaul of this panel that will add favourite templates for easier setting up on various hard drive imports. >>17587 Maybe, but recall that tags are for searching, not describing. If we already figure out the tech to recognise and search by colour, we would probably be better off using some custom system:colour tech rather than using human time to add colour tags to search with. Similar to how we can just go 'system:filesize > 200KB' and don't have to mess around with 'very large filesize' style tags. >>17591 I don't really know anything about CivitAI, but do you have to be logged in to see NSFW? 
Check out Hydrus Companion if you need to copy your browser's login cookies to hydrus: https://gitgud.io/prkc/hydrus-companion
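To illustrate the attached-database setup described above — a sketch only; the file and table names here are illustrative, not hydrus's actual schema — SQLite lets one connection ATTACH several database files and commit across all of them in a single transaction, which is what makes a 'remote_mappings'/'local_mappings' split feasible:

```python
import os
import sqlite3
import tempfile

# One connection, several ATTACHed database files, one shared transaction.
# Names are made up for the demo; hydrus's real schema differs.
db_dir = tempfile.mkdtemp()
con = sqlite3.connect(os.path.join(db_dir, 'client.db'))
con.execute('ATTACH ? AS remote_mappings', (os.path.join(db_dir, 'client.remote_mappings.db'),))
con.execute('ATTACH ? AS local_mappings', (os.path.join(db_dir, 'client.local_mappings.db'),))

con.execute('CREATE TABLE remote_mappings.mappings ( hash_id INTEGER, tag_id INTEGER )')
con.execute('CREATE TABLE local_mappings.mappings ( hash_id INTEGER, tag_id INTEGER )')

# both writes hit different files but land in the same commit
con.execute('INSERT INTO remote_mappings.mappings VALUES ( 1, 2 )')
con.execute('INSERT INTO local_mappings.mappings VALUES ( 1, 3 )')
con.commit()
```

The cost hydev mentions is real: every attached file participates in the commit machinery, so more files means more overhead per transaction.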
>>17601 Thanks for letting me know; I am glad you are working better. We've had such problems with Wayland, particularly recently, that I am encouraging this for all Linux users going forward. We can try again in a year or so, after the next round of Qt and/or Wayland updates, to see if they work better together. >>17569 Great idea. >>17570 Sorry for the trouble, and sorry I missed this before--in any case, I rolled out a fix in v617 so you can actually set custom tag import options for the new e621 downloaders. The new downloader doesn't use the 'e621 file page' in its normal operation, so in the 'manage default import options' dialog you will want to assign options to the new 'e621 gallery page' entries. This hack convinced me I should pretty much clear out this whole mess of a system and replace it with a simple domain-based options structure. If I had that, you would not have run into this problem in the first place. >>17571 Are you editing files in-place, as in you are loading, editing, and saving back the jpeg that is stored somewhere in install_dir/db/client_files? I am afraid hydrus is strictly read-only storage, and it will freak out if you edit a file under its control. Although it is not excellent, the solution at the moment is to export the file, edit it, and re-import it. Hydrus will see the new, edited file as a completely new and separate thing. In future we'll likely have automated pipelines with duplicate file detection and metadata merge, but for now it isn't great. >>17611 Yep, once we have better tech and are overall happy with our ability to auto-merge duplicate file metadata, I'd like some retroactive merge. Maybe it would be a big red button you hit to resync the current state, and maybe we'd have a live sync where if a bad file gets a tag, its better quality duplicate could get the tag too, but that'd probably be much more complicated to pull off.
The good news is that hydrus remembers all duplicate relationships, even on deleted files, so we aren't losing any data here. I just need to write the system to do the retroactive merge, and it can happen whenever. >>17619 Here is some of my old help that has been through a couple of conversions and is now in the misc advanced section: https://hydrusnetwork.github.io/hydrus/advanced.html#importing_with_tags I had a bunch of ideas about supporting paged content like comics, but I've never really been happy about how it all shook out. You can sort by creator-chapter-page stuff, and collect-by chapter to make nice chapter style single objects, but it is all hacky and duct-tape still, after years. The 'collect-by' stuff says 'ok, if the file has any creator or chapter tags, label the file with those tags and gather it with all the other files that have the same label'. You get thumbnails that represent all the files in your view that are ( 'by dave', 'chapter 3' ) or ( 'by dave', 'chapter 4' ), and so on. Any file sort you apply occurs within collections also, so if you sort by namespaces, particularly 'creator-chapter-page', then you get collection thumbs that read like books, making virtual cbr files. The 'unmatched' question asks what to do with files that have none of the respective namespaces, so here if any file had no creator or chapter tags, should we leave those as single files in a big list, or should we compile them into a new 'collection' thumbnail that might have like 257 mixed files in it. I will continue to work here, in fits and starts, and get actual cbr navigation within the client, but for now I recommend Comic Rack or another dedicated reader that will be able to suck up your cbrs or whatever and provide nice reader-specific tools like bookmarks.
>>17621 Thanks. Sure, I will see what I can do. It feels like a simple 'go up one step' is the main missing link here, but we might want some better value scanning tech or something, maybe if only not to be mired in some regex hell. I'll save this and make sure I can write a formula that gets these tags and not the media guy. >>17624 Hell yeah. It does all sorts of HDR tech better too. Some other users were telling me today that Google appear to be prepping to reintegrate it into dev Chrome, perhaps, and Apple are shipping some sort of half support in Safari already. Is it possible JpegXL could win over AVIF and HEIF? I'd love it if so, but we'll see. If the browsers support it, and CDNs host it, the creators can start exporting it and putting it in real product. If phones start capturing it, the deal is done. >>17627 Perhaps, but I can't talk too confidently. The last time I looked at this, I threw my hands up because it suddenly spiraled in complexity. I think it is part of the sibling lookup I do in blacklists, instead of direct tag-to-tag lookups, suddenly I am like scanning the entire sibling space for potential wildcard matches and it isn't working in viable time any more. I may look into it again, but I can't promise anything. Maybe I'll have to fudge the blacklist sibling logic when you have wildcards and tell the user what is going on without it sounding stupid technical. >>17630 Sure, I will see what I can do! >>17634 >>17635 I don't remember changing any of this recently. When you say scrolling in the duplicate window, do you mean the thumbnail view, rather than switching between the pairs of files in the duplicate filter itself? And when you say moving, you find scroll events are skipped/silenced somehow, generally when the mouse is moving when the scroll tick happens? I am afraid to say that when I test here, moving my mouse and scrolling thumbs up and down, things feel correct. I haven't felt it in my IRL client either. 
I would tell you to try the help->report modes->shortcut report mode, but it is precisely the thumbnail grid's scrolling that is controlled by Qt natively, so that won't report anything. If you also get bad scrolling in the media viewer, you could try that report mode there. Maybe there's an error on 'scroll up/down' events. We might thus blame this on Qt, but we haven't changed the Qt version in a while. My 'it is hydrus's fault' guess here is perhaps that the client is very busy with something like subscriptions (and thus CPU-expensive file imports), and that is somehow blocking the Qt thread, causing it to skip mouse scroll events. This happens rarely during extreme busy-ness with some events, I have seen it, but typically when the program gets busy for 200ms or whatever, the event just gets queued up and processed on a delay, when the UI has its thread back. My 'it is the OS's fault' guess is that some OS or mouse driver update suddenly changed scroll handling in some crazy way. Does this problem happen more when your client/system is busy with other stuff? Is there any chance another program has a global hook on scroll events, idk but maaaaybe AutoHotKey, and there's some wild script consuming scrolls under certain conditions?
>>17643 Ahhhhhhhh! Sorry, I broke it by accident! Fixed for v618. >>17648 Don't feel bad, I only added this a few months ago. And I just broke it by a stupid typo, so check it out next week please! >>17646 Yes, this is a great idea. I don't have good custom-action support, or 'favourite duplicate merge options', but I want to push in this direction as we do duplicates auto-resolution and get more tools and reasons to do more careful merging here. Mapping rich custom objects to a shortcut is always a pain, but I think it should be doable. >>17650 Thanks for letting me know, that's interesting about the Windows tool. I'm glad you figured it out and it isn't the worst possible outcome. For your backup restore situation, if you have done some deleting and importing since the backup, you might like to keep your newer client_files exactly as-is but swap in the older (and more reliable?) backup .db files. Then hit up 'help my media files are broke.txt', again in the 'db' directory, which will walk you through how to resync your older db to the newer file reality. There's basically some clever maintenance tasks that do variants of 'if you think you have a file, but don't actually have it in storage, then delete your record of it'. Think carefully on what you want to happen and then execute. If we are only talking like 10 files difference though, and you have them (or their URLs) and can do the manual re-import easily, then just doing that is probably simpler. >>17651 >What i kinda can't wrap my head around is the 'unnamespaced input gives...' option, what it can offer that you can't have with the other options and why when this option is checked, autocomplete behavior seems to be active even though i have 99 characters and 'always autocomplete' checked off when trying this option. I need concrete examples what you can do with it that you can't with the others, i don't understand the tooltip to be honest. Thanks. 
Yeah, basically, a million years ago, I made it so any unnamespaced autocomplete search predicate like 'heroname' would not only search for unnamespaced tags, but any namespace (so that search would provide files that had 'series:heroname' or 'character:heroname'). This was default behaviour. Note when I say search predicate, I mean in a file search. This does not govern what autocomplete results you get for putting 'blah' in the tag search. I constantly get these confused when I think and talk about this stuff. When I was overhauling the autocomplete system, I removed this logical quirk as not actually helpful as a default, and wrote a little label hook so search tags in the form '*:blah' would render as 'blah (any namespace)'. If you want the old search behaviour, you can enter tags like that. Then, once I rolled this out, some guys said 'hey we liked the old behaviour, please add it back', so that's the checkbox. It means that when you type an unnamespaced tag, is the thing you are selecting going to search for unnamespaced only, or any namespace. I'll update the tooltip to be more clear.
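A toy model of the two behaviours being described — this is just fnmatch-style wildcard logic for illustration, not hydrus's actual search code:

```python
from fnmatch import fnmatch

# Toy model of the search predicate behaviour discussed above.
def matches(tag, query, unnamespaced_gives_any_namespace=False):
    if fnmatch(tag, query):
        return True
    # the old/optional behaviour: bare 'heroname' also finds 'series:heroname' etc.
    if unnamespaced_gives_any_namespace and ':' not in query:
        return fnmatch(tag, '*:' + query)
    return False

print(matches('series:heroname', 'heroname'))        # False: strict by default
print(matches('series:heroname', 'heroname', True))  # True: the checkbox's behaviour
print(matches('series:heroname', '*:heroname'))      # True: explicit wildcard form
```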
>>17653 Thank you for your feedback. The 80% and then 'work hard' thing is odd--I don't know why it happened. As we get more rules flying around here, I'll be better able to investigate why it is stopping. Your points about PSD make a lot of sense. I think I have in my notes a similar concern from another user. I think a checkbox that says 'ONLY WORK ON IMAGES' is a good idea, or maybe some similar red text in the UI and help docs is enough. As you say about the suggested rules, I should be very careful they have good rules. I want to take this slowly. Having a custom of slapping 'system:filetype is image' on every suggested rule and every new rule by default is probably the sweet spot, now I think of it. The jpeg/png pixel dupe test was a nice and easy thing, but I now need to write some tools so we can start pushing at these fuzzier situations. Let me know how it all goes for you, and where we discover problems, that's where we need guard rails. >>17655 Yeah, I've been talking about the same thing with the guy who writes the Hydrus Web interface. My API permissions structure sucks, it is way overengineered and not fun to work with. You can limit by tag, but the way that works is unwieldy. Since I've added multiple local file services, we now have this easy way to strictly partition file collections, so they offer an ideal way to shape API viewing permissions. You should absolutely be able to make a local file service called 'share with dave', and then attach his Client API access key to that domain. I can do permission checks on that domain easy, and then you can easily control what he gets to see. I do not know when it will happen, but remind me please if it seems like I forgot about it.
(29.16 MB 2560x1440 scrolling.webm)

>>17659 I'm the second anon. I haven't noticed it in duplicate windows because I use keyboard buttons for those; I have only noticed it in the tag search view. The best way I can describe it is like Hydrus is missing the action, so you need to keep re-scrolling until Hydrus finally picks it up. I wouldn't be surprised if it's a Qt issue. It does seem to happen more when I haven't scrolled in a while, and when Hydrus is busy, but it doesn't 100% correlate, as I swear I've seen it happen even when Hydrus isn't busy at all. Nothing shows in the shortcut report. I run Linux, Hydrus Qt 6.7.3. I just noticed the other anon said it's more common when moving the mouse while scrolling, and that's definitely the case for me too. See the video:
>>17662 Eww, I left black bars in the video.
(899.40 KB 659x823 perkele.png)

>>17659 >scrolling No, it happens literally all the time, that's what makes it so annoying. And I mean literally literally, it's super reproducible by just moving the mouse while trying to scroll (and it only registering scroll events when I completely stop). Duplicate viewer, media viewer, thumbnail view, everywhere in Hydrus and only in Hydrus as of a week ago or so, after updating some random shit (either Hydrus or the system itself, don't remember). It's probably some underlying Linux and Qt issue that's impossible to diagnose, because fuck the entire Linux architecture. I briefly looked at some completely unrelated 7 year old VirtualBox debugging thread that said to install "imwheel", did, and the issue disappeared immediately in all but the duplicate view, where it has a tendency to double-register scroll events, so fuck me I guess. I guess I'll try switching to evdev from libinput. I fucking hate Linux so much.
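For anyone else trying the imwheel workaround mentioned above, the minimal ~/.imwheelrc people usually pass around for 'plain one-to-one scrolling, touch nothing else' is something like this (format from memory, so verify against the imwheel man page; and note imwheel grabs scroll events globally, which might explain the double-registering in the duplicate viewer):

```
# ~/.imwheelrc -- match every window, map one scroll tick to one tick
".*"
None, Up,   Button4, 1
None, Down, Button5, 1
```

Launching it with `imwheel -b "4 5"` so it only intercepts the scroll buttons (4/5) and leaves clicks alone is the commonly recommended invocation.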
>>17657 >>>17587 >Maybe, but recall that tags are for searching, not describing. If we already figure out the tech to recognise and search by colour, we would probably be better off using some custom system:colour tech rather than using human time to add colour tags to search with. Similar to how we can just go 'system:filesize > 200KB' and don't have to mess around with 'very large filesize' style tags. I am talking about using colors as "search tags", not "suggested tags".
>>17657 >>>17591 >I don't really know anything about CivitAI, but do you have to be logged in to see NSFW? Check out Hydrus Companion if you need to copy your browser's login cookies to hydrus: https://gitgud.io/prkc/hydrus-companion https://github.com/civitai/civitai/wiki/REST-API-Reference#authorization
>>17662 >>17664 just wanted to give my input and say that I'm a Linux user running v617 from source through X-Wayland, and I'm on Qt 6.7.3 with KDE Plasma as my desktop environment. I'm not having this issue; every scroll event appears to register fine. Since it's 2 of you, a random mouse error is unlikely, but it works for me, so it's not just because of Linux.
since underscores are so common on boorus that don't allow tags to have spaces, I think a cool thing Hydrus could do (that hopefully won't break any legitimate tags) is to have a simple but specific conversion rule for it, kind of like how it automatically strips leading, trailing, and consecutive spaces from tags. The rule could simply be: >a single underscore between roman characters (or just ascii if you want to be stricter) and numbers is automatically treated as a single space This rule wouldn't affect leading underscores, trailing underscores, or multiple underscores in a row, since there's a chance that they might be there for a reason, but a single underscore between words is almost certainly intended as an alternative to a space when you're not allowed to use spaces. I know that Hydrus has a display option kind of like this, but that only affects appearance. I'm suggesting actually de-duplicating the tags this way. It could be made into an option if you're unsure, but I think with the underscore rule being very specific, it should be safe to just let it be how the tags in Hydrus work, just like how spaces work. I'm not sure how big of a change I'm actually asking for here, but it feels like an uncontroversial one to me. idk maybe I'm wrong
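The proposed rule is narrow enough to write as a single regex. A sketch of my reading of it — this is not anything hydrus does today:

```python
import re

# Collapse a single underscore between letters/digits into a space.
# The lookarounds exclude '_' itself, so leading, trailing, and doubled
# underscores are left untouched, per the rule proposed above.
_SINGLE_UNDERSCORE = re.compile(r'(?<=[A-Za-z0-9])_(?=[A-Za-z0-9])')

def normalise_tag(tag):
    return _SINGLE_UNDERSCORE.sub(' ', tag)

print(normalise_tag('big_breasts'))  # 'big breasts'
print(normalise_tag('__wip__'))      # '__wip__' (untouched)
print(normalise_tag('a__b'))         # 'a__b' (untouched)
```

A doubled underscore survives because each `_` in the pair fails one of the lookarounds (its neighbour is another underscore, not a letter or digit).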
How do I change which monitor files open up on? I'm running Linux.
>>17670 Never mind. I just figured it out. I went to file>options>gui Then, I clicked media_viewer and clicked "reset last size" and "reset last position". Then, I clicked "edit", unchecked "start fullscreen", checked "start maximized" and applied it. Then, I opened a picture, moved it where I wanted it, closed it, and then switched back to "start fullscreen".
>>17671 goode image is goode
Can we pass multiple files to external programs somehow? Currently only the focused file gets sent over. I mean, you can tag then query with the client API, but that's quite slow and roundabout for simple tasks. Also, is it possible to set a naming scheme for drag & drop?
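For reference, the tag-then-query route looks roughly like this. The endpoint name and parameter shapes (`/get_files/search_files`, a JSON-encoded `tags` list, the access-key parameter) are from my memory of the Client API docs, so double-check them; the tag `my:send-to-editor` is a made-up throwaway:

```python
import json
import urllib.parse

API = 'http://127.0.0.1:45869'  # the default Client API port

def search_files_url(tags, access_key):
    # Builds a GET /get_files/search_files request; 'tags' is sent as a
    # JSON-encoded list. Shapes assumed from memory -- check the docs.
    params = urllib.parse.urlencode({
        'tags': json.dumps(tags),
        'Hydrus-Client-API-Access-Key': access_key,
    })
    return f'{API}/get_files/search_files?{params}'

# e.g. tag your selection with a throwaway tag, then:
url = search_files_url(['my:send-to-editor'], 'deadbeef')
# ...GET that URL for file_ids, fetch each via /get_files/file into a temp
# dir, then subprocess.run(['myprogram', *paths]).  Clunky, as you say.
```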
Hey hydev, could we get a way to add urls on local file import, akin to the quick add for adding tags to all/only selected? Maybe it's too specific, but I just thought of the idea when importing a bunch of pictures I extracted from an archive. Also, while I don't see any valid reason for it, I may as well add notes to my request as well.
>>17674 Drag & drop naming is in file>options>exporting>drag-and-drop export filename pattern, you need the "copy files to temp folder" setting enabled for it to work apparently.
>>17676 Dang I completely forgot that exists. Thank you!
Anyone running into rule34.us SSL cert errors lately?
Is there any place with documentation for custom sort-by rules? I can't seem to find any. I wanted to make a custom sort rule of creator -> series -> then probably either the time the file was created or the time it was uploaded to a website, and then the rest of the gibberish like volume, chapter, and page.
>>17679 file>options>sort/collect has namespace file sorting and default collection sort settings, is that what you need?
(4.10 KB 596x136 image.png)

>>17682 That's what I was talking about, yeah. I just wanted to know if there was documentation for the keywords I can use to create new custom ones here, or at least modify the existing ones. What bothers me with some of my images is they aren't always sorted in the "right" order when you look at, for example, comics, or images with variants. I figured maybe I could make sure it always gets it right by changing the custom sort order to sort images by creator, then the series the image is from, and then the time or date of creation/upload, and then other stuff, but I don't know what the keywords to input there would be and can't find info in the manual either; that's what I meant.
Is there a way to always ignore one specific tag in one tag repository as if it wasn't there? Specifically, I never want to see the PTR series:mythology tag because it's e621 trash and actively disrupts series:* searches, but I don't want to ctrl+a, delete all PTR records of it. Same would go for the PTR "anthropomorphism" siblings, which are patently retarded.
>>17686 They're just the namespaced tags anon. If I understand what you want it should be creator-series-import time-other but I don't think you can use import time, I think it has to be a generic tag namespace.
>>17688 Aw, alright then. Thanks anyway.
I'm trying Hydrus on nhentai but it's not downloading. Is it because of a VPN?
Is there anything I can do for a site that returns an error status code even when I enter a direct link to an image? I'm trying to make a downloader for whole threads and I get a 469 error code, but even with a direct link to an image I get a 469. It seems to always return the homepage HTML.
Is hydrus still unable to download from 8chan? I guess the API is gone?
>>17696 The splash page was created for multiple purposes. One was to prevent someone from posting illegal images, archiving their post, then reporting the archive to the archival site, and repeating this until this site was blacklisted from being archived. Archives are grabbing images again, however, so this is no longer prevented, but the splash page still breaks attempts to scrape files, such as with hydownloader.
>>17697 Damn, I wish I could at least pull from the API.
(5.17 KB 507x15 16-01:53:56.png)

>>17698 >it works now Either I imported the wrong cookies last time, and I didn't fuck it up the second time, or the admin changed the site just for me. If it's the latter- uh, thanks admin.
I had a good week. I did a mix of small work, fixing some new and old bugs and cleaning code. The release should be as normal tomorrow.
Is there an elegant way to handle different domains for the same site? I noticed that 8chan.se & 8chan.cc don't work in Hydrus, despite being the same as 8chan.moe.
>>17714 I think you need to copy and modify a url class, and add the relevant urls as examples to the parsers, unless either of those reads or prepends domain names.
Hey guys, I have a weird issue. I'm not an expert at this whole hydrus thing by any means, but I did try my best to figure it out on my own b4 posting here. Here's my problem: a LOT (but not ALL) of coomer.su posts are 403-ing despite having sent cookies over. The kicker is... they fail (403) in a DETERMINISTIC way. If one succeeds, it will keep succeeding, and if one fails, it will always keep failing no matter what. I seriously doubt this is a cookies issue.

What I found through extensive testing is that when I click a link that failed (403) and open it in browser, as expected, it also 403s, BUT the url I open in hydrus is NOT the same one that pops up in my browser. For example: in hydrus, in my file log, I click on one that failed. In hydrus the url displays as: https://coomer.su/data/a7/bc/a7bc747c24357df8a585a366dbd80c71e81ebcfd76ea8263971b4ea276c5c914.jpg but when I open it in my browser I get: https://n1.coomer.su/data/a7/bc/a7bc747c24357df8a585a366dbd80c71e81ebcfd76ea8263971b4ea276c5c914.jpg

Notice the "n1." subdomain at the beginning. This sometimes also appears as "n2.", "n3.", or "n4.", seemingly at random. I've found that if I manually change the subdomain of the url of the failed file import between these 4 subdomains, I'll eventually get one that actually works! In the above example, if I change "n1.coomer.su/data/a7/..." to "n2.coomer.su/data/a7/...", I no longer get a 403!!!!

I have ZERO clue what is going on and hoped that someone here with more knowledge could maybe realize what's going on and help point me in the right direction. For context, I am using the coomer.su downloader I imported from the community github found here: https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/tree/master/Downloaders/Kemono%20%26%20Coomer
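For anyone wanting to retry these by hand or script it outside hydrus, here's a small sketch of generating the mirror candidates, assuming (as described above) that n1-n4 all serve the same files and only some accept a given request. The URL handling is generic stdlib code, not tied to any hydrus internals, and the function name is made up:

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical helper: given a bare coomer.su data URL, produce the n1-n4
# mirror URLs to retry when the original 403s. Assumes the same file exists
# on every mirror, which matches the behaviour described in the post.
def mirror_candidates(url, subdomains=("n1", "n2", "n3", "n4")):
    parts = urlsplit(url)
    host = parts.netloc
    # Strip an existing nX. prefix so we don't stack them.
    for sub in subdomains:
        if host.startswith(sub + "."):
            host = host[len(sub) + 1:]
            break
    return [urlunsplit((parts.scheme, f"{sub}.{host}", parts.path, parts.query, parts.fragment))
            for sub in subdomains]
```

You would then try each candidate in turn until one returns 200 instead of 403.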
Strange issue I've run into with jxls, Hydrus appears to max out my CPU (7700X) when reading them (either during import to generate thumbnails etc, or when browsing full size images). It also somehow takes priority over everything else and causes the PC to stutter, including currently playing audio streams from other programs. Hydrus is set to default normal priority in Task Manager and manually setting it to low fixes this with zero impact on performance. Is there anything that can be done to mitigate this or increase decode performance? I assume we're tied to whatever Python dependency we use? I ask since IrfanView etc appears to be able to decode jxls with much less effort. If it matters at all the images are all encoded at effort level 5-7 and are lossless.
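Since setting low priority in Task Manager reportedly fixes it with no performance cost, one workaround while the decoder library matures is to launch the program below normal priority from the start. A minimal sketch: the flag is Windows-only (it falls back to 0 elsewhere), and any executable path you pass is of course your own:

```python
import subprocess

# Workaround sketch: start a program below normal CPU priority so heavy
# image decodes cannot starve audio/video in other applications.
# BELOW_NORMAL_PRIORITY_CLASS only exists on Windows builds of Python.
LOW_PRIORITY = getattr(subprocess, "BELOW_NORMAL_PRIORITY_CLASS", 0)

def launch_low_priority(argv):
    # e.g. argv = [r"C:\Hydrus Network\hydrus_client.exe"]  (placeholder path)
    return subprocess.Popen(argv, creationflags=LOW_PRIORITY)
```

The same effect can be had with a shortcut that runs `cmd /c start /belownormal ...` on Windows, or `nice` on Linux.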
>>17717 I had my fair share of debugging the coomer downloader this week and ended up where you are, though I think the coomer.su/data link should get forwarded to the correct url (so n1, n2, or something). What I found out is that '/data' is missing in a lot of URLs for me when I try to import a gallery, for example. I edited the parser so that it adds '/data' to the pursuable URL that it finds, and I get a lot fewer 403s now, but still a lot. E.g. this is a 403 for me - parsed by the downloader: https://coomer.su/fd/a4/fda40d4ee7c09b34154e204934f1450881df82ea48bb364b08c66903b298aa18.jpg but this is not: https://coomer.su/data/fd/a4/fda40d4ee7c09b34154e204934f1450881df82ea48bb364b08c66903b298aa18.jpg Maybe someone smart can fix it.
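The parser-side fix described above boils down to a path normalisation. As plain logic (the example paths are from the post; the function name is made up, and in hydrus itself this lives in a string converter, not Python):

```python
# The '/data' fix as plain logic: ensure a parsed kemono/coomer file path
# starts with '/data', without doubling it when it is already there.
def normalise_data_path(path):
    if not path.startswith("/data/"):
        path = "/data" + path
    return path
```

This is the same "prepend text" string-conversion step the modified parser applies to primary and attachment file URLs.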
(7.46 KB 512x132 fixed_coomer.png)

>>17717 >>17720 (me) I did some more tweaking and added a string converter for attachments and primary images, prepending the text '/data' Now it seems to find the correct URLs, I parsed an entire gallery without a single 403 error. I'm not sure if this breaks downloading from kemono, but on coomer it seems to work great now. I attached the post api parser I modified. Could you try it and see if it works?
https://www.youtube.com/watch?v=10YHsVc01IY
[Embed]
windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v618/Hydrus.Network.618.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v618/Hydrus.Network.618.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v618/Hydrus.Network.618.-.macOS.-.App.zip

No Linux build this week, sorry! If you are a Linux user and want to help me out, there's a test build with instructions here: https://github.com/hydrusnetwork/hydrus/releases/tag/v618-ubuntu-test-01

I had a good week mostly fixing some old and new bugs. Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

highlights

I accidentally broke the 'force metadata refresh' job (from the right-click->urls menu) last week! Sorry for the trouble, should be fixed now.

I overhauled the hydrus IPFS service plugin. It works much more simply now, with less experimental and ancient code (nocopy is now disabled, sorry!), and directory pinning is fixed to use their new API. I disabled the native multihash downloader also, but I think I'll bring it back in some more sophisticated way that uses the normal downloader UI--if you are a long-time IPFS user, let me know what you'd like. Updated help is here: https://hydrusnetwork.github.io/hydrus/ipfs.html

I removed a couple crazy system predicate preferences under options->file search. The more-confusing-than-useful one that did 'hide inbox/archive preds if one has count zero' is simply gone, and the one that did 'hide system:everything if client has more than 10k files' is just replaced with a simpler 'show system:everything (default on)'. Thanks for the feedback on these.

If you have the 'archived file delete lock' on, the various file maintenance jobs that check if files are missing/invalid now navigate your situation better! If you had file maintenance paused, try turning it back on.
next week

More bugs to work on, and I want to reserve some time for duplicates auto-resolution work so that doesn't slip.
>>17722 I'm sad to say it doesn't seem to be working for me. I'm not quite sure which downloader component I'm supposed to be looking for. When I import the image, hydrus says it is adding "kemono post api parser", so I tried clicking on "manage parsers" in the downloader components and tried to compare by eye, but couldn't find a difference. Do note that I am not very good at this whole parser-making thing. Of course I also tested this on an actual gallery after I imported the image, but it still kept giving me 403s :( I'm not sure if this is user error on my part or if what you gave doesn't work, as I cannot really read the imported modified parser to check for sure. Could it possibly be that I need to delete my existing parsers before replacing them with this new one...? I'm sorry I couldn't be of more help.
>>17762 It's the kemono.su post api parser. If you edit it then you should be able to see the components of the parser. This image might be a little chaotic but that's what I modified for the primary and secondary attachment files. Could you send a screenshot of the file list with the 403 errors? I just wanna look at the URLs that it parsed for you.
Hello fellow retards. How do I get watchers for 8chan working? It gets stuck on the disclaimer.
>>18046 You need the cookies from a session that has gone past the ToS. If you use hydrus companion you can send them directly.
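If you'd rather script it than use Hydrus Companion, the hydrus Client API also has cookie endpoints (see the Client API docs, /manage_cookies/set_cookies). A rough sketch that only builds the request: the port, access key, and cookie values below are placeholders, and the cookie row format ([name, value, domain, path, expires]) is my reading of the docs, so double-check it against your client's API help before relying on this:

```python
import json

# Sketch: build a POST request for hydrus's /manage_cookies/set_cookies
# Client API endpoint. All concrete values here are placeholders - check
# the Client API documentation shipped with your client.
def build_set_cookies_request(api_base, access_key, cookies):
    url = api_base.rstrip("/") + "/manage_cookies/set_cookies"
    headers = {
        "Hydrus-Client-API-Access-Key": access_key,
        "Content-Type": "application/json",
    }
    body = json.dumps({"cookies": cookies})
    return url, headers, body
```

You would then send it with urllib or requests, passing the splash-page cookie copied from your browser.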
>import files with tags via .txt sidecar
>tags have underscores in them
damn... did hydrus change how it handles underscores? I remember it just treating them as spaces no matter what. I poked around and found the option to display underscores as spaces, but if I turn that on and have one image with 'test tag' and another with 'test_tag' then type test in the search box I get "test tag" twice (they show up as completely different tags). Is there a way to do a one-time zap to turn underscores in tags into spaces?
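There's no built-in one-time zap yet, but if the tags are coming in via .txt sidecars, you can clean them before import. A throwaway sketch, assuming the common one-tag-per-line sidecar layout (check a sample of your own files first, and work on copies):

```python
from pathlib import Path

# Workaround sketch: rewrite .txt sidecars before import so underscores
# become spaces. Assumes one tag per line - verify against your own
# sidecars, and back them up first.
def despace_sidecar(path):
    p = Path(path)
    tags = p.read_text(encoding="utf-8").splitlines()
    cleaned = [t.replace("_", " ") for t in tags]
    p.write_text("\n".join(cleaned) + "\n", encoding="utf-8")

def despace_all(folder):
    for txt in Path(folder).glob("*.txt"):
        despace_sidecar(txt)
```

This only fixes future imports; tags already in the database would still need the tag-replace tooling hydev discusses elsewhere in the thread.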
Is it normal to have a 700mb main db file with 97% free pages, which got down to 27mb after vacuuming (which was quite funny ngl)? I have around 50k images. And what does "free pages" actually mean? I only know a little bit about sqlite; that might be something I haven't gotten around to yet.
(63.69 KB 533x784 1.png)

(27.29 KB 444x537 2.png)

>>17731
>More bugs to work on
Hey Hydev, I found some weird behavior which might not be intended, but you have to tell me if it is. So here is the deal. First I need to say that I'm still in the testing phase and my client is set up like this:
1. I added the PTR
2. I have no siblings for 'my tags' and only a handful of parents/children for 'my tags', for testing purposes.
3. In 'manage where tag siblings and parents apply', under the 'my tags' tab in the 'parents application' box, I have 'my tags' and added 'PTR' underneath, so that all the PTR parents/children apply to my tags also.
So I tested whether the PTR tags applied to my tags on a test file, and it worked like it should. Then I checked what the parents submenu on right-clicking the 'studio:dc comics' tag would show. As you can see in the uploaded image (1.png), there are 3 blocks of tag domains: --my tags, PTR-- which contains the most (9 written + 228 more), while --PTR-- and --my tags-- have 18 and 23 entries. Now I would assume that if everything were correct, everything would be under --my tags, PTR--, so both domains have all the parents/children? Why are there entries in the --PTR-- and --my tags-- blocks at all? I am pretty sure everything was synchronized at that moment, since my client isn't turned off and my PC also not, so it should be. But I didn't check at that time tbh. What I did when I saw those 3 blocks: I right-clicked the tag selection box and chose 'maintenance' -> 'regenerate tag display'. Since the warning box says that it would only regenerate for the selected entries and would only take some minutes, I started the process. But unfortunately it started to regenerate everything in the PTR tab in the 'tag sync display' window. I didn't realise it at first, but later there were around 30k+ parents left to sync.
While it was in the middle of syncing, I checked the parents submenu of studio:dc comics again (check 2.png) and saw that a lot of them got swapped over to --my tags--, --PTR-- was gone, and --my tags, PTR-- had only a few left. I thought it would proceed to swap everything to --my tags-- and that it was maybe supposed to be like that, so I waited like 30 hours (!) till the sync was done and checked again. The result was that it showed me the 3 blocks of entries like in 1.png, so kinda where it all started. So I wanna know:
1. Was me starting the maintenance job doing unnecessary work without changing anything in the end? Yeah, I know it says 'WARNING EXPERIMENTAL'.
2. Should there be 3 blocks of entries? Seems wrong, I think. Shouldn't everything be under --my tags, PTR--, so everything applies to both? Because if you type the entries from the --PTR-- block into the 'manage tags' window (my tags tab) of a file, the autocomplete/suggestions are not showing and highlighting the parents like when you type in an entry from the --my tags, PTR-- or --my tags-- blocks.
3. How to fix it otherwise, if not by using this maintenance job?
4. Unrelated to all of the above: the 'show whole chains' checkbox in the 'manage tag parents' settings seems to not do anything for me, so I don't know what it is for. Can you give a quick tutorial/example of what I have to do in order to see what it actually does when activated?
>>17855 lmao, sorry for not writing earlier, but by the time I saw this, I had already managed to figure it out on my own. I quite literally reproduced all the modifications you made for myself locally: I went in and added the prepend text for the primary and secondary files, just like you did in this post. Funnily enough, I came to this solution on my own, lol. So good news, it DOES work for me now. The only issue left that I don't understand is why the downloader png didn't work... in case lurkers or other people in this thread would want access to this as well, yk? Now, knowing what I know, I went back and deduced what might've been the problem. The parser I modified manually was called "kemono.su post api parser", BUT next to it I noticed there were 2 others with a similar name, called "kemono.su post api parser (1)" and "kemono.su post api parser (2)". I remember that I tried importing your png TWICE, and when I went into these two parsers I found that they were indeed ALREADY modified to prepend "/data" to all primary and secondary files. So it seems to me the issue was this: since I already had a parser named "kemono.su post api parser", when I imported your image, the newly imported parser got automatically renamed to "kemono.su post api parser (1)", and so on. Since both the "coomer.su all-in-one creator lookup" and the "coomer.su onlyfans creator lookup" were configured internally to use "kemono.su post api parser", the result ended up being exactly the same, since the correct new parser to use would have been "kemono.su post api parser (1)". Because I MANUALLY changed the behavior of "kemono.su post api parser", the already existing creator lookups finally started working correctly. Mystery solved. Not sure what conclusion to draw from this for future generations tho... maybe delete your existing parser before importing one with the exact same name?
It seems strange to me that hydrus chose to rename the newly imported parser instead of overwriting it...
>>17662 >>17664 I am sorry for the frustration--that looks like a real pain. Unfortunately, the thumbnail grid is a very 'Qt' widget, in many ways. Although I do handle some scrolling and mouse moving stuff, for the most part the actual grunt work of firing off scroll events and so on is all done through Qt. I hate to abdicate responsibility, but I haven't changed this code in a very long time, so if this has recently changed for you, I have to first assume it is something like a new mouse driver or Window Manager (X, Wayland) update that is now conflicting with Qt event processing.

Since you are on 6.7.3, are you on the built release or running from source? If running from source, or happy to try running from source (https://hydrusnetwork.github.io/hydrus/running_from_source.html), could you try rebuilding your venv and choosing the (a)dvanced install, and then selecting the (t)est Qt? This should give you 6.8.2.1 or so. If the behaviour differs or is fixed, then we could write this up as some Qt bug that has since been fixed.

If that doesn't fix it, or you cannot run from source, can I ask you to load up a tall multi-column list, maybe something like a subscription file log or the network->data->bandwidth list when set to show all history, and then tell me if this scrolling problem is repeated there? Also, same deal for the normal 'selection tags' taglist, when it is like ten pages tall? The taglist uses similar tech to the thumb grid, where it is a virtual scroll area that I render myself; the multi-column list is completely native Qt. If scrolling is busted in both, that suggests it is Qt (or some application-wide event handling thing I am doing wrong somehow); but if multi-column lists are fine, that suggests my custom rendering tech is eating/merging mouse events. Oh yeah, and can you please do the media viewer too? Just load up a media viewer with like twenty files and see if scrolling through it is smooth if you move the mouse.
The media viewer works entirely on my custom shortcut handling routine, no Qt tech, so if that works or doesn't it gives us some more info. >>17669 Yeah, I broadly agree. I was going to explore implementing this through en masse sibling rules, but I scaled back my 'lots of siblings' plans, like for namespace siblings, when I ran into unexpected computational complexity developing the new system. You are talking about replace, and I think that's the correct place for this--a hard replace where I basically write a routine that says 'delete the underscore tag, add the prettier one', rather than virtualised and undoable siblings. I am still gearing up plans for hard replace, but it will come. It should be much simpler than siblings and parents, pretty much just a clever dialog like 'migrate tags' that fires off bigass async jobs with some BACKUP FIRST red text all over it. >>17674 I'd still like to do this natively sometime. Some way of saying 'open these externally' and pass a list of paths to the external exe you want. >>17675 I'd love to. Current plan is to revamp that whole dialog to use newer string processing/conversion tech, and then integrate sidecar tech better. Sidecars work technically good and do URLs and notes and stuff, but they are a fucking nightmare to actually use because the UI is hell. If you haven't tried them yet but feel brave, have a poke around the 'sidecars' tab and look here: https://hydrusnetwork.github.io/hydrus/advanced_sidecars.html
>>17689 Sorry for the limitation here. I'm hoping to expose the secondary sort on normal search pages in future, so you'll be able to do two-stage sorting a bit easier, but I can't think about supporting super clever crunchy sort logic like you want yet. >>17678 CDNs will sometimes do this to effect rangebans I think. I think they terminate the SSL handshake in some rude way in lieu of a proper http response, in order to save resources? Or maybe bamboozle spiders? If you are on a VPN, maybe try changing region? >>17687 Try right-clicking the tag and then saying hide->(tag) from here. There are finer-grained options under tags->manage tag display and search (which that menu option dumbly populates). It won't remove the tag from manage tags dialogs, but it'll remove it from normal views. >>17694 I don't think I've ever heard of a 469 before--it sounds like a joke, maybe? You might like to try putting in the URL in the manage parsers test panel to see if there is any body to the message (it might have an error text or something), or help->debug->network actions->fetch a url, which does basically the same thing. If the site doesn't like you hitting up direct URLs, it is probably a cookie or referrer issue. You might like to try fetching the same URL in your browser with dev mode on and see the network traffic. Is a referral URL sent when you do normal browsing, and is it something hydrus isn't/could be replicating? If you do the fetch in a private browser window (i.e. no cookies), do you get the same 469? Does it have any extra info in the response headers? >>17714 >>17716 In future I want URL Classes to support multiple domains. There's some technical pain in the ass stuff I have to do to make it work behind the scenes, but there are several situations where the same engine or site itself runs on multiple domains and you end up having to spam with the current system. I had this when I did the e621, e6ai, e926 stuff recently. 
Relatedly, I'd like some better en masse URL renaming or redirecting or normalising tech. It'd be nice to handle domain name changes or merging legacy http -> https duplicates with a clever dialog or URL Class settings that would just say 'ok take all these weird urls and rename them in this clever way to this nicer format'.
>>17718 I noticed this too with some of my test files, that larger ones would eat a lot of CPU and memory, but it wasn't as bad as you have seen. That sucks if it is hitching your whole system. This is the library we use for it: https://github.com/Isotr0py/pillow-jpegxl-plugin It provides encode/decode and registers with Pillow as a plugin. It is in Rust, I think a python wrapper around this https://github.com/inflation/jpegxl-rs , which is bindings on libjxl itself. My assumption was the extra memory and CPU was due to the higher level languages here, and particularly because this is new tech and it just hasn't been optimised yet. I could see how the lossless decoder is shit simply because there haven't been many real world use cases yet. My plan was to wait and assume that new versions (and, hopefully, an eventual native plugin integration into Pillow) would simply improve performance. EDIT: I tried generating an effort 6 fairly large lossless jpegxl, and while I didn't get audio hitching, it hitched a youtube video in the background while loading. That sucks! Unfortunately, I am not sure if I can do much about this. I am planning to re-engineer my render pipeline to handle large images better, but new libraries like this may not even benefit from that. I think I have to assume that this library has been written with some aggressive thread yielding or whatever because they made it for command line encoding or something, and it hasn't been smoothed out yet. Wait and see, is probably the correct strategy. Hydrus is set up to get the latest version of this library every week, so fingers crossed performance improves in future. >>18162 Sorry to say I don't think I have changed the behaviour here, and I don't think I've had underscore-replacement tech in sidecars, so my guess is your incoming tags were nicer quality somehow before, or you had string processing steps to clean them up manually? Underscores have been a bit of a problem for a long time. 
I've been foolish in trying to support as much as possible when it comes to allowed tag characters and so on. I do want to have a big red button that merges them all to whitespace equivalents, and options to massage all incoming tags with rules like 'ditch all underscores according to these patterns' to stop future instances. I just wrote about it a bit earlier here >>18230 in response to >>17669 . Not yet, but I want it in future.

>>18188 I wrote like three replies to this with long-winded technical bullshit, but I'm still not sure exactly what happened, so I'm just going to make it simple. It might be that I have a sync calculation bug in my code, and I will look into it. Your actual questions:

1) Might have been, but if there is a logical problem in my sync code, this was the correct way to fix it. I do not know why it appeared to resync everything in your PTR--it should just do everything that is connected to 'DC'. Now, DC might have tens of thousands of connections, so maybe this triggered 20-60% of all the sibling rules (I'm sure pokemon is connected to DC some way, and that to a hundred other things).

2) Might be a presentation problem--there's some technical weirdness in the program between the 'actual' sync and the 'ideal' sync, and that menu currently shows the actual, which I think it shouldn't. Regarding the three blocks part--do you have any/many DC-based siblings or parents in your 'my tags', or are all these tag relationships coming from the PTR? If your 'my tags' disagrees with the PTR (for instance, if it has 'catwoman' as a child, probably because of some sibling mapping disagreement), then I will recognise that the three domains do not share all the same ideal and inferred tags.
If your 'manage where siblings and parents apply' rules just said 'PTR' on both 'my tags' and 'PTR', or indeed if they said 'my tags, then PTR', then eventually these two lists would harmonise completely, but if the two services differ, then putting 'my tags' in there will cause different lists. Most importantly, the UI here is shit and I can do better. I will think about this. 3) Not easily, and only with bigger commands that truly trigger a 100% complete resync of everything. 4) Basically if you put in 'series:batman', the dialog is supposed to load all the direct descendants (character:batman) and ancestors (studio:dc). But dc has children in other branches, let's say superman, and character:batman may have other parents, let's say 'superhero' or something. Those 'cousins', and all the very complicated nest of tertiary and n-ary branches that spill out from that and probably connect to pokemon and azur lane one way or another, may be useful to see, or they may be spam. The checkbox is supposed to show that, but the dialog's pair-based view is still a horrible way to look at the directed graphs here. Thank you for this feedback. I will do a bit of poking around here and see if I can do some quick improvements to the UI and so on to make things clearer and help our future debugging for this stuff. I'll look into the 'show whole chains' thing too.
>>18172 That is very unusual, but if you had recently deleted a LOT of data, it would be normal. Think of a SQLite database file like a hard drive that can grow in size but not shrink. The file is split into many 'pages' (like 1-4KB each or so), just like your disk, and there's some metadata like a page bitmap that says which pages are in use and which are not. When you delete data, if the page no longer has anything in it, it is added to the list of free pages. When we need to write extra data, if it doesn't fit into an existing page, SQLite then asks itself if it has a free page or if it should expand the filesize to give itself more free pages. It is a little more complicated than this (afaik it doesn't edit pages in-place but instead writes new valid data, updates page pointers, and adds out-of-date pages to the freelist), but that's the basic idea.

Vacuum basically says 'ok, create an entirely new database file and write all the data from the old file in a super efficient way to the new file, filling up pages as much as possible on this first write run'. Not only do we create a new file with no free pages (since no deletes have occurred), but all the pages are pretty much maximally filled since they haven't been edited and shuffled about. It is similar to a defrag, with the additional bonus that any free space is truncated.

If your client.db lost 97% of its internal stuff, that's a lot of shit that's gone! client.db tends to store pretty important stuff that doesn't get deleted or recalculated. Like you can delete a file, but for every 'current files' row that is deleted, a 'deleted files' row is added, and stuff like basic file metadata is never deleted. client.caches.db will increase or decrease like 20% if you add or remove a file service, but can you remember deleting anything particular from your client recently? I can't even think what it would be.
Some potentially bloaty things like notes are stored in client.db, I'm pretty sure, but there's no way to delete the master records atm iirc. Maybe parents and siblings, with the PTR, somehow? Did you recently delete the PTR? Otherwise, have you recently had database damage? I assume everything feels normal, if you haven't noticed like giant amounts of metadata missing from your collection. If you have never had a big meaty database with a lot of metadata, I suppose it could be that SQLite (by which, presumably, I mean me) used 650MB of client.db space for some pseudo-temporary storage, but I can't think what it would be.

If you happen to have an old backup from before any big delete you did, you might like to download your respective 'command line tools' from here https://sqlite.org/download.html and then run sqlite3_analyzer on your old and new client.db files. It isn't a big deal, but if you want to poke around more, and learn more about SQLite, that tool produces nice document reports on the size of tables and stuff. If everything seems fine, you are good. The pages are free, so you can do your vacuum and keep using the client. If your db bloats up again, let me know. Let me know how you get on regardless!

EDIT: Oh wait, subscriptions or GUI Sessions, maybe? Did you clear out a giganto search recently?
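If you want to watch the free-page mechanics above for yourself, SQLite exposes them through pragmas. A throwaway demonstration (on a scratch database, not anything hydrus-related): fill a table, delete everything, watch `PRAGMA freelist_count` grow, then VACUUM and watch the file shrink:

```python
import os
import sqlite3
import tempfile

# Demonstrate SQLite's freelist and VACUUM on a throwaway database:
# deletes put pages on the freelist, VACUUM rewrites and truncates the file.
def freelist_demo():
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE t (x)")
    # ~5MB of blob data, forcing many pages to be allocated.
    con.executemany("INSERT INTO t VALUES (?)",
                    ((b"\x00" * 1000,) for _ in range(5000)))
    con.commit()
    size_full = os.path.getsize(path)
    con.execute("DELETE FROM t")
    con.commit()
    # Pages are now marked free, but the file has not shrunk.
    free_pages = con.execute("PRAGMA freelist_count").fetchone()[0]
    con.execute("VACUUM")
    size_vacuumed = os.path.getsize(path)
    con.close()
    return size_full, free_pages, size_vacuumed
```

Running it, the post-delete file is the same size as the full one while `freelist_count` is large, and the vacuumed file is a tiny fraction of it, which is exactly the 700mb -> 27mb pattern described above.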
(342.88 KB 1243x630 2025-04-20 06_10_21-.jpg)

>>18264 Thanks that's a lot of useful info. I don't think I did any major deletions recently, or ever. I'm usually pretty paranoid about losing/messing up data so yeah. Does hydrus write these things in the logs? I don't keep gigantic search pages around for long either (I like to keep things snappy and responsive). And from looking at my backup history, the file has been roughly the same size for at least 2 months. I don't use the PTR either. The only major thing that I can think of is moving my entire 150gb media library to another location. Don't know if that could be it. Anyway I will pull my backup and compare them. Might be a good chance to practice some more sqlite as I've been getting into it recently. Will keep you posted if I find anything interesting.
(8.08 KB 307x138 Screenshot (269).png)

An idea I had to deal with galleries that have these high image counts. For gallery downloaders, I had the idea of setting some kind of threshold that automatically makes a 2nd or more query entry, like a part 2, part 3, and so on, once a certain number of images is reached. Say in the gallery downloader, users would set it to make a new query after 1000 images is reached, something like
>query [artist]
>once it reaches 1,000 images
>query [artist part 2] is made, starting with the 1,001th image
and so forth. This would then keep Hydrus from killing itself trying to open 10k images when you go to check on some random downloaded gallery. I guess my idea is to set up options to break up large work into smaller pieces. I'd rather deal with
>query artist 2k images
>query artist part 2 2k images
>query artist part 3 2k images
than
>query artist 6k images
And I kind of wonder if this can be done with tabs as well when searching for something with a certain threshold set, i.e. searching for a tag with over 1,000 images, or search + limit set to 1,000 with a threshold set to 500 per tab:
>threshold set to open 500 at a time
>make 2 tabs, 500 each, where it focuses on 1 tab at a time
I wonder if that would also help reduce any strain on the program as well.
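The splitting logic itself is trivial; the real work would be wiring it into the downloader/page UI. As plain logic, with a made-up name and a hypothetical 1000-item part size:

```python
# The idea above as plain logic: split one big query's results into
# fixed-size parts so each downloader query / page stays small.
def split_into_parts(items, part_size=1000):
    return [items[i:i + part_size] for i in range(0, len(items), part_size)]
```

So a 2,500-file gallery would become three parts of 1000, 1000, and 500.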
(16.75 KB 669x343 3.png)

(32.92 KB 669x819 4.png)

(73.34 KB 579x726 5.png)

>>18260
>Now, DC might have tens of thousands of connections, so maybe this triggered 20-60% of all the sibling rules (I'm sure pokemon is connected to DC some way, and that to a hundred other things).
The PTR has 35,342 pairs, the 'manage tag parents' window tells me. When I noticed that my SSD was working hard, I checked the sync status, which was around 94-95% and said '30k+ parents to sync'. Later I took the screenshot (3.png). It says only 'parents', no 'siblings'. So I THINK it pretty much resynced all parents/children.
>>18260
>2) Might be a presentation problem--there's some technical weirdness in the program between the 'actual' sync and the 'ideal' sync, and that menu currently shows the actual, which I think it shouldn't.
But don't we want the 'actual' synced ones to show? Because the tags that are only shown in the --PTR-- block don't apply any parents/children when entered in the 'my tags' domain on any file. At least like that we know there is a sync problem. But I'm not sure what you mean by 'ideal sync' either, so you would know better :D
>>18260
>Regarding the three blocks part--do you have any/many DC-based siblings or parents in your 'my tags', or are all these tag relationships coming from the PTR?
I only had a handful of not-DC-related parents like 'test1/test2' in 'my tags' and no siblings when I encountered the 3 blocks. Later I added/created a handful of related stuff (test stuff like studio:dc comics 2/dc comics 3/dc comics 4), but it didn't change anything regarding the 3 blocks. Now I have deleted everything from the 'my tags' parents dialog and it still shows me the 3 blocks. So now I have no siblings and parents, and the 'manage where siblings and parents apply' dialog looks as seen in '4.png'.
The PTR tab has 'PTR' in both boxes; I didn't change anything there. I also checked other tags with many parents like 'series:star wars' or 'studio:marvel', but also normal blue tags that have 20+ parents. They often have those 3 blocks, and most are always in the --my tags, PTR-- block, but for 'pokémon' 2/3 are in the --PTR-- block (5.png), weirdly enough. So every tag from that block would not apply the parents/children when entered in the 'my tags' domain.
- Sync status for 'my tags' says '52,640 rules, all synced!'. Though I am not 100% sure if it had any rules here before I started the regenerate tag display maintenance job. I also don't know why it's 50k+ rules and not 35,342, which is the number of PTR pairs which apply to 'my tags' now. I guess rules and pairs are not the same?
- Sync status for 'PTR' says '611,995 rules, all synced!'. 35k parent pairs + 558k sibling pairs don't equal 611k rules, so I guess here too you can't just count them together to get the number of rules, right? But '3.png' would say otherwise -> 584,477 rules applied + 26,667 parents to sync = 611,144 (rules).
Is anyone here who has parents set up like I did in '4.png'? If so, could you check if the parents submenu on a tag also shows you different blocks of domains like in '5.png'? Thx!
>>18260
>The checkbox is supposed to show that, but the dialog's pair-based view is still a horrible way to look at the directed graphs here.
Well, it turns out that the 'show all pairs' checkbox is not supposed to be active when doing that, lol. Now it works. I thought it would filter the whole 35k list to a smaller one. And of course you have to have some tag entered in one of the boxes below, as the tooltip says. Maybe you can make it so that 'show whole chains' can only be activated when 'show all pairs' is deactivated? That would only have pros (like helping confused people like me) and no cons, right? Thanks for your help!
>>18223 Glad to hear you solved it, and yeah that importing is weird, maybe it should show a prompt about overwriting the existing or creating a new parser... not sure. There is a repository for downloaders (that's where i got the original one from), maybe we should upload the fixed one or something. Though I haven't checked if this modified version works with kemono or not, it's possible that their api is different than coomer's or something
>>18447
>maybe we should upload the fixed one
Oh, most definitely, I hadn't even thought about that. Good catch!
>Though I haven't checked if this modified version works with kemono or not, it's possible that their api is different than coomer's
Coomer is a sister-site to kemono; as far as I know it is run by the SAME people. It would only make sense that they use the same tech and api, no? I'm pretty sure that this is the entire reason coomer doesn't have a separate parser in the first place, and why both kemono and coomer were merged into this one parser to begin with. They're so similar that it just made sense; otherwise you'd just be making a redundant copy with the exact same behavior, but for a different domain. But even so, we should definitely test that it doesn't bork kemono first, just in case.
>>18436 (Dead)
There have been some changes, read >>17731
>removed a couple crazy system predicate preferences under options->file search. The more-confusing-than-useful one that did 'hide inbox/archive preds if one has count zero' is simply gone, and the one that did 'hide system:everything if client has more than 10k files' is just replaced with a simpler 'show system:everything (default on)'. Thanks for the feedback on these.
Would it be too much work to add a predicate for every file with a note containing a certain search string? e.g. I'd like to find every work with a note mentioning "commission".
I haven't used Kemono for a couple of months, since it and some other sites were unavailable without a VPN for most of Scandinavia, I'm pretty sure, but now that I can access it again it only grabs files that aren't jpgs and pngs. It downloads zip files, mp4s, psd files and some others just fine, but completely ignores any pngs on the same posts and ignores posts with just pngs and jpgs entirely. I'm sure this has been brought up before, but I could not find this exact issue in previous threads; I might have just missed it.
>>18668
Hello, fellow nord. There are no default parsers for kemono or its cousins. Have you tried this one?
https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/tree/master/Downloaders/Kemono%20%26%20Coomer
If it doesn't work either, I'm going to be updating my own custom parser, probably today. I can share it then.
>>18670 Tried it and some other fixes people posted in previous threads yeah. None have worked so far, so your custom one would be much appreciated.
>>18668 It works fine for me, but my downloader for it is heavily customized. I have no idea what the Cuddlebear one looks like, or if it's even been updated since I last used it. My guess is that your downloader is using an old api url.
>>18685 The one in the repository only works partially for some posts. >>18672 I'll post the components later once I'm happy with having ironed out various use cases.
(7.10 KB 512x193 kemono 2025-04-21.png)

>>18672 >>18690
OK, I don't know how to get the gallery URL class/parser to produce the next "page" outside a gallery downloader page (the docs talk about "automatic next gallery page rules" but I can't find this anywhere), so I'm not sure if the components for parsing user pages will actually work as fully intended. Anyway, it should parse posts just fine; feel free to share any post URLs that don't parse as expected.
Kemono's API is quite accessible if you want to dabble in creating your own components, by the way.
(These are, incidentally, based on the outdated components found in the github repository.)
>>18766 Works fine for me now, thank you very much.
I'd like a shortcut for inverting the current selection of files (select files not selected). ctrl + i, maybe?
Is ATF gone for good?
I had a good week. I fixed several bugs, cleaned some jank unicode characters out of tags, improved quality of life, and added new tools to duplicates auto-resolution. The Linux test build last week went well, so that should be back too. The release should be as normal tomorrow.
How good is this software for when you inevitably need to migrate your collection? I'm worried about having to move over to another PC or maybe swapping from Windows to Linux. Would it be feasible to move the collection, fully set up from one to another?
>>19114 Pretty easy and simple, with maybe a few minor extra steps. Your database folder (called "db") inside the hydrus folder is where all your images, tags, and settings are, so whatever installation method you use (exe, portable, linux, mac) won't matter so long as your db (database) folder is still intact. You can delete the entire hydrus folder except the "db" folder and still keep all your stuff and settings. Oftentimes you may have to do this if you want to do a clean install.
>>19154 Thanks. I'll start using it for boorus. With this + lanraragi for doujin I only need something for voice works and I'll be done with sorting.
Is it not recommended to modify the db files directly? Something like mass renaming tags/namespaces or replacing some characters. Does hydrus cache data somehow or does it pull everything from db files on every startup?
https://www.youtube.com/watch?v=lawkc3jH3ws
windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v619/Hydrus.Network.619.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v619/Hydrus.Network.619.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v619/Hydrus.Network.619.-.macOS.-.App.zip
linux
tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v619/Hydrus.Network.619.-.Linux.-.Executable.tar.zst

I had a good week. There's a mix of several sorts of work, and duplicates auto-resolution gets more tools. Your client is going to clean its tags on update. If you have a lot of tags (e.g. you sync with the PTR), it will take twenty minutes or more to complete.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

Linux build

The Linux build had a problem last week at the last minute. Github retired the old runner, and I missed the news. I have rolled out a test build that uses 22.04 instead of 20.04, and several users report that the build boots, does not seem to need a clean install, and may even fix some things. If you use the Linux build, please update as normal this week. If the build does not boot, please try doing a clean install, as here: https://hydrusnetwork.github.io/hydrus/getting_started_installing.html#clean_installs

If today's release does not work at all, not even a fresh extract to your desktop, please let me know the details. If this happens to you, you might like to consider running from source, which solves many Linux OS-compatibility problems: https://hydrusnetwork.github.io/hydrus/running_from_source.html

I've now got a job to check for runner news in future, so this sudden break shouldn't happen again.

misc work

AVIF rendering broke last week, sorry! Should be fixed now, but let me know if it sometimes still fails for you.
I updated the tag-cleaning filter to remove many weird unicode characters like 'zero-width space' that slip in, usually because of a bad decode or parse or copy-paste transliteration. On update, your client will scan all of its tags for 'invalid tags', renaming anything bad that it finds. If you sync with the PTR, this will take at least twenty minutes and will likely discover 30,000+ invalid tags--don't worry too much about it. If you want to see exactly what it found, it logs everything.

If you use sidecars for export, I moved the hardcoded 'sort results before sending them out' job to the string processor that's actually in each sidecar. Every sidecar will get this new processing step on update. They work as they did before, but if you do want the results sorted in a particular different way, you can now change it.

duplicates auto-resolution

I had success adding more tools to duplicates auto-resolution. You can now do "A has at least 2x the num_pixels as B" comparisons for some basic metadata types, and also say "A and B have/do not have the same filetype". I have enabled all the UI and added two new suggested rules for clearing out some pixel-perfect duplicates. If you have been following along, please check these out and let me know what you think. I do not recommend going crazy here, but if you are semi-automatic, I guess you can try anything for fun.

Odd bug I've just noticed while playing around: sometimes, after editing existing rules, the list stops updating numbers for that edited rule. Closing and opening a new duplicates processing page fixes it. I'll fix it properly for next week.

The next step, so we can push beyond pixel-perfect duplicates, is to figure out a rich similarity-measuring tool that lets us automatically differentiate alternates from duplicates. I'm thinking about it!

next week

I might try this 'A is > 99.7% similar to B' tech for duplicates auto-resolution.
I've got some IRL that might impact my work schedule in a couple weeks, so I'll otherwise just do some small jobs.
Edited last time by hydrus_dev on 04/23/2025 (Wed) 22:08:17.
>>20211
Yeah, generally speaking, I'd say writing edits to the database is pretty dangerous and I would not suggest it. If you change some tag texts in client.master.db, you'd also have to change some cached lookup and FTS values in client.caches.db, and then if you are thinking about merging duplicates after the rename, you are suddenly in a swamp.

Feel free to poke around though. Use any SQLite tool you like, and I'm happy to answer questions. SQLite Studio is good if you just want to browse. If you see something really simple to change, or if you feel brave enough to edit the json options structure, say, then you can try, but the most important rule, as always, is to make a backup beforehand. If you make a backup, then you can go crazy, and if it all goes wrong, no worries.

If you want to rename namespaces, that's tricky. If you are willing to put work in, I'd recommend using the Client API instead. Do mass tag delete/replace jobs, and then you are using my code to handle all the caching and counting logic adjustments for you. https://hydrusnetwork.github.io/hydrus/client_api.html
Edited last time by hydrus_dev on 04/23/2025 (Wed) 22:31:44.
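To make the 'use the Client API instead' route concrete, here is a hedged sketch of a mass tag rename: search for the old tag, then delete it and add the new one in a single call. The endpoint names and the add/delete action codes ('0' and '1') are from the Client API help; the access key and tag service key are placeholders you would fill in from your own client, and you should of course still back up first.

```python
import json
import urllib.parse
import urllib.request

API = 'http://127.0.0.1:45869'
HEADERS = {'Hydrus-Client-API-Access-Key': 'YOUR_ACCESS_KEY'}
TAG_SERVICE_KEY = 'YOUR_TAG_SERVICE_KEY'  # listed by /get_services

def build_rename_body(file_ids, service_key, old_tag, new_tag):
    """Request body for /add_tags/add_tags: action '0' adds, '1' deletes."""
    return {'file_ids': file_ids,
            'service_keys_to_actions_to_tags': {
                service_key: {'0': [new_tag], '1': [old_tag]}}}

def rename_tag(old_tag, new_tag):
    # find every file that currently has the old tag
    qs = urllib.parse.urlencode({'tags': json.dumps([old_tag])})
    req = urllib.request.Request(f'{API}/get_files/search_files?{qs}',
                                 headers=HEADERS)
    with urllib.request.urlopen(req) as resp:
        file_ids = json.load(resp)['file_ids']
    if file_ids:
        # delete the old tag and add the new one in one transaction-ish call
        body = json.dumps(build_rename_body(
            file_ids, TAG_SERVICE_KEY, old_tag, new_tag)).encode()
        req = urllib.request.Request(
            f'{API}/add_tags/add_tags', data=body,
            headers={**HEADERS, 'Content-Type': 'application/json'})
        urllib.request.urlopen(req).close()
    return len(file_ids)
```

Doing it this way, the client's own code keeps all the sibling/parent caches and counts consistent, which is exactly what hand-editing the db files would break.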
Is it possible to put one file-service's files in a location? i.e. put pdfs in My Documents/hydrus and pics in Pictures/hydrus?
>>20561
No, not with a single database at least. You could run two different instances of Hydrus: one for pics and another for documents. This would accomplish what you want, at the cost of needing to run two clients and maintain two databases. For details, see the section on database migration in the help files (Help --> help and getting started guide).

Please note, hydrus does not preserve filenames. You'd end up with every file named by an SHA1 hash, so you would have to use hydrus to find pdf files; it would not be human-navigable. If you're just holding on to organizing files by hand, this probably isn't worth it. Remember: search, don't sort.

If you're wanting to, for example, keep all your personal stuff separate from your porn collection, then running two different instances would be worth considering. It really just depends on exactly what you're trying to accomplish by separating where different filetypes are stored.
Having a rule searching inside a certain distance for non-pixel-perfect pairs with the same height and width seems to be working well for finding alternate pairs. I feel like I have to ask the dumb question, though: The auto-resolution system searches only for files that haven't been processed for duplicates already, right?
>>20377 Maybe I am too retarded now, but the HTML does not seem to come through to the subsidiary page parser window in the parser editor.
>>20678 That's about 618.
>>20377 >automatically differentiate alternates from duplicates. The most annoying duplicates to me are WebP images from Alibaba sites. There are always JPEG and WebP versions, and they have different artifacts (one ruins sharp corners, and the other ruins flat areas), and sometimes watermarks, and JPEG versions from different sites with different watermarks.
Does anyone have an NHNB parser?
A couple of notes for the duplicate auto-resolution system: * Rule renaming isn't working * I'd rather not be sent back to the top of the list when I approve or deny the pending action for a pair
>>19049 what do you mean by ATF?? I only know of one ATF and I'm pretty sure it's not what you're thinking of.
>>20984 The booru of alcohol, tobacco and firearms. It stopped working when they added that anti-DDoS thing.
>>21018 >alcohol, tobacco and firearms And explosives. Everyone forgets that bit.
>>20377
>Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html
>actual vs ideal tags
Not sure if this is supposed to change anything regarding my 3 blocks sync problem, but this update unfortunately didn't seem to have changed anything about it. It still looks like before. Just wanted to let you know.
>updated the /add_tags/get_siblings_and_parents help to discuss that it fetches actual rather than ideal tags
Where exactly can that help be found? Client, website, or somewhere in the hydrus folder? I couldn't find it.
>>21107 >>21018 that's..... a weirdly specific booru...
>>18287
My best guess is you had a big session for a while, some big download, and the multiple backups of that increased the size over time. If the download page changed frequently (most do), then you get a bunch of older copies hanging around the page memory, but there's a routine that clears them out if there isn't a primary saved session saying they want them. Odd that your db seems to be increasing in size right up until recently and then now says 90-odd% free. Could be lucky timing, but I wonder if SQLite can sometimes lazily just give itself more disk space rather than consult its own free pages list? I wouldn't expect that, but these things can be complicated.

>The only major thing that I can think of is moving my entire 150gb media library to another location. Don't know if that could be it.
Nah--the only thing that changes here is a little table that just lists location->directory prefix. If you run sqlite_analyze, my guess here suggests you'd see a bloat in the tables 'json_dumps_names' and 'json_dumps_hashed'. Let me know if you discover anything!

>>18400
Not a bad idea! I've always resisted pagination in the program, but things really do get strained at like 4k+ numbers. I recently made the 'selection tags' box only calculate tags for like the first 4k in view, and it worked out great tbh. It is always simpler to just say 'ok, make the user wait twenty seconds and then dump a 13k result on them', but maybe I should figure out a nice widget to simply define a file range for all these sorts of things and then let the user optionally set that gauge to what they want.

>>18420
Thanks, this is interesting. I can't give a really clear answer that exactly explains what you are seeing, but in v619 I did change that menu to show the 'ideal', which may or may not help in future.
I don't know how to nicely communicate the differences here, but I may just do a db hit, and if actual/ideal are not synced, I'll put a label up saying so--though talking about this logic, which runs on 20% black magic anyway, is an ongoing problem.

The difference between actual and ideal is that actual shows what is currently computed in the display tag cache, and it is what you see in the GUI when you click on a file. If you add a new parent for A->B, or 50,000 such rules, the actual display tags that you see do not change immediately, because it takes a lot of CPU to do so. My display sync routine does little bits of work in the background, slowly recalculating what you 'actually' see towards what you 'ideally' see according to all the rules.

A rule, iirc, is basically 'If A exists on the storage domain, B should (not) exist on the display domain'. It is the simplest algebra in the system, and a sibling or parent can be expressed this way. A parent adds "If A exists, B should exist". A sibling adds "If A exists, A should not exist" + "If A exists, B should exist". My syncing routine basically calculates the rules that currently exist and the rules that should exist, and migrates them one at a time until actual and ideal are aligned. Under good conditions, the user only notices it as seeing tags update three seconds after they ok changes on the manage siblings/parents dialogs.

Siblings and parents interact, which _may_ explain your unusual numbers here. If your rules say that "A is sibling to B" but also that "A has B parent", then the system has to navigate that contradiction (it does this by collapsing all tags to ideal siblings first, and then any A->A parents get ignored). The PTR has all sorts of bullshit knots and loops from over the years, and if you mix it with other rules from your "my tags", I guess the numbers can get whack. But I don't know for sure.
Since the PTR parents are generally for most users predicated on the PTR siblings, you might have luck adding the PTR siblings to your "my tags". That might connect some odd loops together. >Maybe you can make it so, that 'show whole chains' only can get activated, when 'show all pairs' is deactivated? Great idea, thanks.
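The algebra hydev describes can be sketched in a few lines, purely for illustration (this is nothing like the real cached implementation): collapse every tag to its ideal sibling first, then apply parents transitively, ignoring any A->A pairs the collapse produces.

```python
def display_tags(storage_tags, siblings, parents):
    """storage_tags: tags as written in the storage domain.
    siblings: {tag: its_ideal_sibling} pairs.
    parents: {child_tag: set_of_parent_tags}."""
    def ideal(tag):
        seen = set()
        # follow the sibling chain to its ideal end, guarding against loops
        while tag in siblings and tag not in seen:
            seen.add(tag)
            tag = siblings[tag]
        return tag

    # sibling rules: "if A exists, A should not exist" + "B should exist"
    shown = {ideal(t) for t in storage_tags}
    # parent rules: "if A exists, B should exist", applied transitively
    queue = list(shown)
    while queue:
        t = queue.pop()
        for p in parents.get(t, ()):
            p = ideal(p)                   # parents act on ideal siblings
            if p != t and p not in shown:  # A->A parents get ignored
                shown.add(p)
                queue.append(p)
    return shown
```

For example, with the sibling 'catwoman' -> 'character:catwoman' and the parent 'character:catwoman' -> 'series:dc comics', a file stored with just 'catwoman' would display both 'character:catwoman' and 'series:dc comics'.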
>>18667
Yeah, I keep meaning to do this. It keeps getting put off, I'm afraid. The database has fast text search already prepped for notes; I just need to write the predicate and some UI for it.

>>18995
Thanks, I will see what I can do. You can already set a bunch of 'select files' commands in the 'thumbnails' set, but it doesn't look like I support 'invert' yet. I'll look at how it does it and try to expand the list.

>>20194
Detailed info here if it helps: https://hydrusnetwork.github.io/hydrus/database_migration.html
Short answer is it is easy to move stuff around, no worries. Let me know if you find anything confusing.

>>20561 >>20601
I expect to have 'put a file with x property in y location' tech in the medium-term future. The new Metadata Conditional object that I have introduced in the duplicates auto-resolution system will help this. I need to write some better file storage tech to do 'a file can be in multiple locations' and 'let's move this one file to a better location' too, and then fingers crossed it will all knit together.

>>20601
Just fyi, it is SHA256. Same basic thing as SHA1, but a longer hash.

>>20653
Thanks, good to know.
>The auto-resolution system searches only for files that haven't been processed for duplicates already, right?
Yeah, it searches the same 'potential duplicate pairs' as the rest of the duplicates filter. If you look at your rule list, it should say in the 'progress' column something like '64,540 did not match the search'. That plus the failed test count plus any pending count should be roughly the total pending pairs count if you set your duplicate filter to 'system:everything' and like distance 16. The number will differ a little, probably, because of some legacy pairs, but that's what it is searching. If you manually add or dissolve potential pairs, the rules' numbers should change. Importantly, when you set a duplicate or alternate or false positive pair, the 'potential pair' is dissolved, so it is removed from the system.
In complicated cases, this can transitively dissolve other potential pairs. Each pair can only ever be actioned by one rule.

>>20678 >>20679
Thanks, I noticed something similar the other day. I will double-check my test data pipeline here.
>>20717
Thanks, that's a good example I hadn't thought of. Can you think of what the sort of rules would look like for the jpeg/webp eliminator? Let's assume the user in this case always wants the jpeg.

search:
A: is jpeg of reasonable size
B: is webp of reasonable size

comparators:
A is similar enough to B to be considered a duplicate
A is jpeg
A is bigger filesize than B (probably true in 97% of cases, but let's be safe)

action:
Set A better, delete B

I presume A would pretty much always be bigger than the webp. We'll have to see how the IRL data shakes out here as I test out this new automatic alternate/duplicate differentiator. I feel like small encoding artifacts are easy to detect, but I probably can't differentiate watermarks from 'legit' alternate costume changes etc... If we can catch a good percentage of resizes and 'optimisations' anything like automatically, though, I'll be happy.

>>20931
>Rule renaming isn't working
Thanks, I fucked up the list refresh after dialog ok somehow. Opening a new duplicates page seems to refresh it generally. I'll fix it.
>I'd rather not be sent back to the top of the list when I approve or deny the pending action for a pair
Thanks, I'll figure it out!

>>21218
Ah, damn, thanks anyway. I'll keep poking around on my end to try and make this clearer to users and better reveal actual logical problems I still have in the system. "Tag Relationships Apply In A Complicated Way" box here: https://hydrusnetwork.github.io/hydrus/developer_api.html#add_tags_get_siblings_and_parents
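For what it's worth, that proposed comparator reads sensibly when written out as plain logic. A throwaway sketch, not hydrus code (the FileInfo shape and mime strings are made up for illustration, and the similarity search is assumed to have already declared the pair potential duplicates):

```python
from dataclasses import dataclass

@dataclass
class FileInfo:
    mime: str   # e.g. 'image/jpeg' or 'image/webp'
    size: int   # filesize in bytes

def jpeg_eliminates_webp(a: FileInfo, b: FileInfo) -> bool:
    """True when A should be set better and B deleted, per the rule above:
    A is the jpeg, B is the webp, and A is the bigger file ('probably
    true in 97% of cases, but let's be safe')."""
    return (a.mime == 'image/jpeg'
            and b.mime == 'image/webp'
            and a.size > b.size)
```

The filesize guard is what keeps the rule conservative: a jpeg that is somehow smaller than its webp twin simply falls through untouched for manual review.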
Got this error message when I was messing around with auto dupes: v619, linux, source

AttributeError
'NoneType' object has no attribute 'GetSummary'
File "/home/user/hydrus/hydrus git/hydrus/client/gui/widgets/ClientGUICommon.py", line 316, in EventButton
    self._func( *self._args, **self._kwargs )
File "/home/user/hydrus/hydrus git/hydrus/client/gui/lists/ClientGUIListBoxes.py", line 375, in _Add
    self._AddData( data )
File "/home/user/hydrus/hydrus git/hydrus/client/gui/lists/ClientGUIListBoxes.py", line 396, in _AddData
    pretty_data = self._data_to_pretty_callable( data )
File "/home/user/hydrus/hydrus git/hydrus/client/gui/duplicates/ClientGUIDuplicatesAutoResolution.py", line 809, in _PairComparatorToPretty
    return pair_comparator.GetSummary()

I don't know what went wrong though.
>>21265
>Can you think of what the sort of rules would look like for the jpeg/webp eliminator? Let's assume the user in this case always wants the jpeg.
The problem is that I cannot decide which one is better. It is only easy when the size of one is twice as big, but sometimes it's just 2 KB different. Maybe the watermarked photo was converted jpeg->webp->jpeg. Or maybe there is a camera that saves WebP, or it was converted to WebP from raw. So they are neither same-quality duplicates (where I can theoretically delete either automatically), nor usual alternates (which are not duplicates at all).
>>21350 >>21265
Although, it is only because of potential use with neural networks that I keep such files.
Holy shit, you added that file-property comparator system I suggested already?! I thought that was gonna be a few months from now. You're the man!
Any chance you could add known urls to the "A matches this"/"B matches this" part of it? The use-case is that I want the worse dupe to have no known urls, but the better one could either have them or not. If I add that to the search with "both match different searches", then it also lets A not have known urls, because it tries the searches both ways around. I want the one that will be the worse dupe to definitely have no known urls, and it looks like it'd need to be a comparator test for that to happen.
(Also, this one's a smaller, unimportant request, but it'd be cool to have this comparator system "ported" to the normal duplicate filter.)
When in the duplicate filter, I sometimes want to check the displayed potential duplicate for duplicates. It would be nice to be able to add it to the search from there when simple.
I installed the software, and am skimming the documentation. I have to manually tag everything?
How do I set up 8chan parsing again? >updated to v619 >opened 8chan.moe and cleared the disclaimer >made a post >exported cookies using "get cookies.txt locally" >drag txt file onto "review session cookies" >Added 7 cookies! >drag a png into window >Looks like HTML -- maybe the client needs to be taught how to parse this? also thank you for this software. I'm at 33GB archived (91.94%) and it's been a lifesaver from slow windows thumbnails and "multi" character folders
If it's just your own booru software + image tagger, I'm unconvinced it's revolutionary
>libbz2-a273e504.so.1.0.6 Huh doesn't this library have known CVEs? Why is it being published 2 years after vulnerability disclosure? https://github.com/opencv/opencv-python/issues/702
>>21562 No, but beware that it doesn't have a good way to save exact file names, unless you write them into sidecar files and then read into notes in hydrus, and later export the notes into sidecar files and rename the files using your own script.
>>21562 Depends where you get the files from. If downloading from a booru, there's usually tag parsing, so you will get the booru tags, but it depends on the parser available; a site with no tags or just a file url likely won't get any tags. If you have the time and a free ssd, you can use the Public Tag Repository (PTR), but if you keep a small collection you might not want to bother; it takes a good while to completely sync. The PTR may get tags in that scenario, but then it depends on whether someone decided to tag it by hand.
That file history chart is pretty cool, but is there any chance you could add a "minimum date" or "since" cutoff for it? My client is pretty old, so showing the entire history makes it difficult to see smaller changes, because the x-axis is so squished. Being able to set it to something like "since 1 year ago" or "since 2023" would help make it easier to read, especially since I'm not really paying attention to the earlier years anyway, so it's just noise to me.
>>21665 You can do that. Use the "system:time" setting to show the file history either before or since your designated import, modified, last viewed, or archived times.
>>21659 Hmm. I might use the software if something like the WD14 tagger for ComfyUI were integrated. That way I could "mine" tags from my images instead of manually tagging.
>>21676 Don't know about ComfyUI, but there is a WD14 tagger for Automatic1111 at https://github.com/67372a/stable-diffusion-webui-wd14-tagger
Just have it save tags to a txt file and re-import files using the tag txt files as sidecars. Works very well, but does take a little manual work.
(6.99 KB 512x193 kemono 2025-04-28.png)

>>18766 Updated my kemono parser, because notes weren't being created with the text of posts. Also converted the subsidiary page parsers sub-components to proper content parsers; Didn't see a good reason to keep them as the former.
>>21676 It's not integrated but there is this: https://github.com/Garbevoir/wd-e621-hydrus-tagger I've used it and it works well, but I find it a little too much hassle personally.
>>21659 My experience with the PTR is that it's a giant mess and should only be used where absolutely necessary. Tons of files with heaps of irrelevant tags, all because someone at some point pulled them from a random e(x)hentai image collection (no, I don't fucking care that this one artwork of Saber was in an image collection of works by Afrobull; Tracer isn't in this goddamn artwork). And then the jannies (doing it all for free, bless their hearts!) get pissy when anybody tries to tidy things up in their little internet fiefdom, because of some obscure rule or convention mentioned in a Discord conversation from 2018, so they ban your client's service id, which is of no consequence because you can just regenerate the id whenever you want. Honestly, you're probably better off with an AI tagger for quality descriptive tags.
>>21682
I haven't had nearly as much heartache as you're describing with the PTR, so I don't mind it. I like getting a file from somewhere random like a catbox upload and having it already come with tags. I especially like it when I get a file that's been deleted off a booru and, due to the PTR, it still has booru tags; it makes my life much easier.

I find a lot of mistags on old files though. Thankfully it seems most people have figured out PTR tagging nowadays. I'm sure there were some teething issues, especially in regards to the duplicate filter; some of the tag combinations I've seen are just bizarre, like 1girl and 6+girls tags on the same file. I also agree with the "tag for 1 file in a collection" issue. Pixiv also has that, and it can be annoying, except it's worse because there are no presence rules.

I don't do descriptive tagging as I don't have the time or patience, so I usually just put character, creator, series, and rating namespaces. That is usually enough to find a single work. Personal tags like a favorite or the ratings services can also help in the search.
>>21627
>No, but beware that it doesn't have a good way to save exact file names, unless you write them into sidecar files and then read into notes in hydrus, and later export the notes into sidecar files and rename the files using your own script.
Can you explain what you mean by 'exact file names'? When you import your files, you can 'add tags with the import' and choose to save filenames and directories to a chosen namespace with the checkboxes on the right. So a file with the name 'sunshine.jpg' gets the tag 'filename:sunshine', for example. It's missing the '.jpg'; do you mean that?
>>21687 "Sun Shine*.jpg": uppercase lost, multiple spaces collapsed, asterisk not allowed.
Is there a way to produce a list of local tags in an easier format to skim through? Something like a text file? I keep forgetting my previous work like a retard every time I get the urge to continue tagging and scrolling through the selection tags window is getting less and less convenient.
I was skeptical at first because the software is complicated and the instructions are extremely verbose, but I see the value now. Running your own booru has huge potential. I used imgbrd-grabber, but that can only download bulk files; this stores tags and makes them accessible. Super impressive! >>21678 Seems like a big limitation for non-booru media. If this could implement WD14 image tagging + OCR for text, it would be perfect for all meme management, letting me search for specific text in the image or for recognized concepts. Manual tagging kills its use for non-pretagged media IMO.
>>21665 >>21675 I thought someone would respond with that. no that's not what that does. that excludes files based on import time, which means it completely throws off the numbers. I want the results to be exactly the same, but just only showing part of it based on date range I set, instead of it always showing the entire history of the database and being squished
>>21734 > the instructions are extremely verbose I think that's a great thing personally. Better to over-explain than under, IMO. Hydrus is deceptively complex, but that's a good thing. >imgbrd-grabber Not familiar with that, but Hydrus has a ton of import options baked in with a highly configurable system to create your own importers. If your site isn't among the defaults though, check out https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts Lots of scripts there. But also check out Hydrus Companion. It's a browser extension that lets you send items/urls/cookies to Hydrus straight from your browser. https://gitgud.io/prkc/hydrus-companion gallery-dl is also handy to have sometimes for those weird edge cases. https://github.com/mikf/gallery-dl >manual tagging OCR isn't something I need personally so I've never looked into it. But if you can find something to output the OCR results to either .txt or .json then it's a simple matter of importing with sidecars to get what you want. WD14 via Automatic1111 would be the same. Tags get exported as .txt and you simply import with sidecars. You can even import both tags and OCR at the same time. Just use something like <file>.tags.txt and <file>.ocr.txt then import both sidecars simultaneously. Worst case scenario you just import twice, once for tags, once for OCR. Sidecars really are the catch-all for putting any sort of data into Hydrus. The settings for sidecar importing are quite robust. As long as you have the raw data you can import just about anything in any way you want. There has been some talk about Hydev implementing some form of offline AI tagging based off the massive amount of tagging data in the PTR. So that's a thing that may or may not happen one day. There are a few FOSS OCR libraries out there that might be able to be integrated into Hydrus. Maybe Hydev will consider it, but at least for now, sidecars are the way to go.
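For anyone gluing an OCR or WD14 tool to hydrus this way, here's a minimal sketch of the sidecar-writing half, using the <file>.tags.txt / <file>.ocr.txt naming convention from the post above with newline-separated entries (which I believe matches the default .txt sidecar shape — adjust the suffixes in the sidecar importer to whatever you actually use):

```python
from pathlib import Path

def write_sidecars(image_path, tags, ocr_text):
    """Write two sidecars next to the image: one newline-separated tag list
    and one OCR text dump. You then point two .txt sidecar importers at the
    '.tags.txt' and '.ocr.txt' suffixes when importing into hydrus."""
    p = Path(image_path)
    p.with_name(p.name + ".tags.txt").write_text("\n".join(tags), encoding="utf-8")
    p.with_name(p.name + ".ocr.txt").write_text(ocr_text, encoding="utf-8")

# e.g. after running your tagger/OCR of choice on meme.png:
write_sidecars("meme.png", ["1girl", "meme"], "top text\nbottom text")
```

The reverse direction (export with sidecars, then rename files from the notes) is the same idea run backwards.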
>>17427 >>17438 >>17482 hello. sorry about it taking so long, but I finally updated and tested out that "do not allow combined set geometry" option. unfortunately I still get a crash by zooming in, even with the option enabled. it looks like I get a different error message though. python3.12: ../src/x11/x11-window.c:2245: eplX11SwapBuffers: Assertion `sharedPixmap->status == BUFFER_STATUS_IDLE' failed. I don't really know what else to add except that it didn't happen prior to updating from v611 to v614, but it looks like it still happens when I enable the option that you made to revert it, so I don't really know what's going on. Maybe you'll know what that different error message means. Hopefully it's something that's fixable by you. I updated to v619 to test this, if that helps at all.
>>21759 >but it looks like it still happens when I enable the option that you made to revert it to revert the mpv change you made, I mean
>>21759 >>21760 okay this might actually help you! I tested again with the debug option disabled, and I got the same error as before you added the option. python3.12: ../src/x11/x11-window.c:2344: eplX11SwapBuffers: Assertion `pwin->current_back->status == BUFFER_STATUS_IDLE' failed. that suggests to me that the debug option you added did actually work, but there's another issue that's also causing the same bug and the same crash.
>>21757 Very cool and TY for the response. Some things I've noticed. >Images default to "scale to fit" but videos do not. >audio doesn't work on gnome+fedora (tried mute/unmute and increasing volume levels) Gnome volume mixer doesn't show the program outputting audio. Could be me because fedora is really fucky with media codecs. I'll mess with it more tonight.
I had a great week. I figured out animated webp frame timings (so they are no longer all set to 12fps), added some new JSON parsing tech, and fixed some things in duplicates auto-resolution. There are also some more UI improvements by a user. The release should be as normal tomorrow. There will be a 'future build' to test a better AVIF solution.
Got it fully working with mpv. The (Ubuntu?) binary build doesn't work on fedora. Gives an error about /lib64/libgio-2.0 missing, and only breaks mpv integration. Lib is in the hydrus folder but fedora doesn't see it. I'm running the server binary and client from source lol. >>21802 Thank you very much dev-sama! I'll probably stop using web boorus thanks to this software :^)
>>20194 >lanraragi I wish hydrus handled comics/doujins better, or that some other alt program did. Lanraragi is pretty alright but I wish for something that isn't browser focused. Hydrus ironically has the perfect UI for comics, with some needed touch-ups.
Could you add an option to return files deleted from the duplicate filter to the inbox, so that when you have that "don't delete archived files from trash" option enabled, you won't be stuck with a bunch of duplicates in the trash?
https://www.youtube.com/watch?v=qM2qMPbTR6c
windows zip: https://github.com/hydrusnetwork/hydrus/releases/download/v620a/Hydrus.Network.620a.-.Windows.-.Extract.only.zip exe: https://github.com/hydrusnetwork/hydrus/releases/download/v620a/Hydrus.Network.620a.-.Windows.-.Installer.exe macOS app: https://github.com/hydrusnetwork/hydrus/releases/download/v620a/Hydrus.Network.620a.-.macOS.-.App.zip linux tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v620a/Hydrus.Network.620a.-.Linux.-.Executable.tar.zst Hey, I broke the manual duplicate filter in the initial v620 release on Wednesday evening. The v620a links above are now (Thursday afternoon) to a hotfix that fixes this. Thank you for the reports, and sorry for the trouble! I had a great week. There's some fixes, some quality of life, and a bit of new tech. For advanced users, there is a future build to test out better AVIF rendering here: https://github.com/hydrusnetwork/hydrus/releases/tag/v620-future-02 Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html highlights Thanks to a user, we have some more UI quality of life improvements. The options dialog now remembers the last page you were looking at; the media viewer can now save its position on exit (useful if you use multiple); the media viewer can now be dragged around even when frameless; you can now fit the media viewer to the size of the current media; and you can put 'all my files' or 'all local files' at the top of the page selector mini-dialog (useful if you have a _lot_ of local file services). Also, you can map these new 'resize frame to media' commands, and new 'zoom to x%' commands, in the 'media viewers - all' shortcuts set. I figured out a webp parser, and we now have correct frame timings for animated webps (they were previously locked to a 12fps fallback). All your animated webps will be scheduled for a metadata regen on update, and then they should render correctly.
I added a couple of rules to the JSON parsing formula: you can now test the string values of variables you parse, and you can now 'walk back up ancestors', like in the HTML parser. This means you can now filter on the existence or content of a particular key or value in an Object or List and then walk back up and parse something else. duplicates auto-resolution I have fixed an important bug that happened when renaming rules. If you renamed some rules in the past weeks and noticed they somehow didn't always stick, you'll get a popup on update about it. The affected rules will be paused and may roll back to a previous version. Please check they are named correctly and have the options you want before resuming them to semi-automatic or automatic. Auto-resolution rules were also interrupting idle mode; I think I've fixed it. I had a lot of success working on 'A is an exact match duplicate of B' comparison tech. I am not ready to plug it in yet, but I wrote a prototype that does some image-tile histogram comparison stats and it works to differentiate resizes/re-encodes from even minor alternates, at least on a small test scale. It needs to render both images, so it takes about 1 second to run. I am going to plug it into the manual duplicate filter as a comparison statement, and we'll tune it for wider IRL examples, and then I'll improve the auto-resolution UI to better handle laggy comparisons. I'm feeling a lot better about this--there's more to do, but it doesn't seem impossible. next week I may have some IRL stuff happening next week, and it is possible it will tie me up for a while, so there may not be a release for a week or two. I'll post updates when I know more. Otherwise, I'll push on this new 'A is an exact match duplicate of B' tech in the manual duplicate filter, and if the AVIF future build goes ok, that'll be folded in as well.
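For anyone curious what 'image-tile histogram comparison stats' could look like, here's a toy pure-Python sketch of the general shape of the idea. All names and tuning numbers here are mine, not hydev's — the real prototype works on rendered pixels and is surely more sophisticated:

```python
def tile_histograms(img, grid=4, bins=8):
    """img: 2-D list of 0-255 greyscale values. Returns one normalised
    brightness histogram per tile of a grid x grid partition."""
    h, w = len(img), len(img[0])
    hists = []
    for ty in range(grid):
        for tx in range(grid):
            hist = [0] * bins
            y0, y1 = ty * h // grid, (ty + 1) * h // grid
            x0, x1 = tx * w // grid, (tx + 1) * w // grid
            for y in range(y0, y1):
                for x in range(x0, x1):
                    hist[img[y][x] * bins // 256] += 1
            total = sum(hist) or 1
            hists.append([c / total for c in hist])
    return hists

def worst_tile_distance(a, b):
    """Max per-tile L1 distance between two images' tile histograms.
    A resize/re-encode nudges every tile a little; an alternate (edit,
    watermark, recolour) usually blows out at least one tile."""
    return max(sum(abs(x - y) for x, y in zip(ha, hb))
               for ha, hb in zip(tile_histograms(a), tile_histograms(b)))
```

Thresholding the worst tile (rather than the average) is what lets a scheme like this separate 'globally slightly different' resizes from 'locally very different' alternates.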
Edited last time by hydrus_dev on 05/01/2025 (Thu) 20:44:22.
>>21915 I'm getting errors which eventually lead to a hard crash on this version (v620, Windows 10). This only happens when using the duplicate filter: v620, 2025-04-30 21:15:21: QBackingStore::endPaint() called with active painter; did you forget to destroy it or call QPainter::end() on it? v620, 2025-04-30 21:15:21: Uncaught exception: v620, 2025-04-30 21:15:21: AttributeError 'MediaResult' object has no attribute 'GetMediaResult' File "hydrus\client\gui\canvas\ClientGUICanvas.py", line 3115, in _PrefetchNeighbours
>Jump from 614 to 620 >Updating through 619 >Scanned for and fixed 3,800 bad tags Bit curious about this.
>>21924 he added tech to strip several kinds of invisible character from tags. since those tags are no longer valid, fixing them just means stripping those characters from tags already in your db
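The exact character set hydrus strips isn't spelled out here, but the usual zero-width suspects (ZWSP, ZWJ, BOM, etc.) all fall under Unicode's 'format' (Cf) general category, so the cleanup looks roughly like this sketch:

```python
import unicodedata

def strip_invisibles(tag: str) -> str:
    """Drop format-category (Cf) codepoints -- zero-width spaces/joiners,
    BOMs, and friends. Illustrative; hydrus's exact rules may differ."""
    return "".join(ch for ch in tag if unicodedata.category(ch) != "Cf")
```

Two tags that look identical on screen but differ by an embedded zero-width space collapse to the same string after this, which is exactly the dedupe the update performed.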
>>21915 >>21923 yup me too. I'm on Linux running from source
>>21832 Support already seems good. Import doujin.zip and open the .cbz externally. Gnome photo viewer treats the .cbz as a gallery. The only improvements I can think of are opening .cbz internally, and also downloading tags from ehentai/nhentai. Scraping those sites would be shitty because of all the duplicates.
>>17634 >scrolling As an addendum, this kind of resolved itself. The scrolling problem appeared in multiple other programs (but not all of them) and then went away with another system update, and I have no idea how or why. I still hate bleeding edge linux distributions.
>>21932 >I'm on Linux running from source Try running the venv with different options for Qt before updating.
>>21923 Me too
>>21968 but I didn't update the venv, so how would that help?
>>21923 >>21932 >>21968 >>21984 >>21981 Sorry, I fucked something up! Fixing and testing now, will have a hotfix up within an hour. I'll edit and replace all the links in the release post above when ready. Thank you for reporting.
>>22032 I pulled and it seems to work now. thanks!
(49.46 KB 1230x355 6.png)

>>21260 >Since the PTR parents are generally for most users predicated on the PTR siblings, you might have luck adding the PTR siblings to your "my tags". That might connect some odd loops together. Good news! I tried that and this fixed the 3 block problem. It really seems that parents rely on siblings a lot regarding this display problem. So after I put the PTR siblings under 'my tags', a new sync started, with all siblings and ca. 1/4 of all parents again, as you can see in '6.png'. Before that I made sure everything was synced 100%. So I also assume that you can see your 'actual vs ideal' update from v619 in action here in the screenshot, since you can see that all children/parents were in one --my tags, PTR-- block, even though a lot of sync work wasn't done yet. For other tags, that was also the case. After sync everything is still in that 1 block, which is great. In v618 you probably would see tags from the other 2 blocks get gradually moved to the correct one, right? I have some questions still: 1. When I did the sync for parents back then, it synced like all parents and took 30+ hours iirc. Siblings sync was done overnight. But can you tell me, if for example I start a sync like that by putting PTR siblings under 'my tags' but then change my mind after 5 minutes, is that process just as quickly reversible by deleting PTR siblings from the 'my tags' siblings box? I mean, only the processed tags get processed again and everything should be like before after another 5 minutes? 2. Let's say I put a child tag on a file in the 'my tags' domain. That tag gets stored in the storage domain, but the parents are not; they just get displayed because they are in some display domain, correct? If I delete 'PTR' from the 'manage where parents and siblings apply...' settings, all the parents in the chain will not be displayed anymore (in case I have no 'my tags' parents), only the child tag, right? I would kinda lose all the parents.
- In general, is there a way to mass apply the displayed, but not stored, tags so they are stored as well? Kinda 'burn' them in, so you could theoretically delete siblings/parents and still see all the tags? Not sure what all the positive or negative points would be when you do that though. Maybe that would increase the size of the DBs a lot? Positive might be that you can delete other tag domains from 'manage where parents and siblings apply...' or tag domains from 'manage services' without fear that many tags aren't displayed anymore if 'my tags' for example relies on parents/siblings from other domains. Not sure if that would be very useful, so maybe that's more of a theoretical question. 3. If adding PTR siblings to 'my tags' had not solved the issue, I thought maybe it would be better to mass migrate (-> add) parents from PTR to my tags from time to time. Would that have any downsides, except that I'd be behind the PTR by some days (depending how often I do that)? And do you think there would be the same sync/loop tag display problems because siblings are needed too, or not, because this time parents would really be in 'my tags' in the manage parents dialog? 4. Right-click in tag selection -> experimental. The 'multiple media view tags' is the display domain from the 'tag filter for multiple file views' from 'manage tag display and search...' settings, right? Is the 'display tags' tag display for single file views (media viewer) then? I could not verify that by doing a quick test, so I wonder what it is for. It shows the same as the first one. Can you give a quick explanation? The 'stored tags' tag display is clear. It shows what you actually entered and not the siblings/parents.
Tweaks that would be nice: >tag files with source domain >tag files with source domain's user score (if available) >tag files with publishing date on source domain It helps sort by where you are getting files, what users like of that artist, and how the artist has progressed over time.
Also capturing image pools so they get added to relevant image sets. Might be a lot harder though.
>>22091 See the items in the network menu.
Getting a consistent crash in the duplicate processor when trying to view the second file in a comparison. If I enter the duplicate processor and then immediately leave it, there's a notification about this error occurring. v620, win32, frozen AttributeError 'MediaResult' object has no attribute 'GetMediaResult' File "hydrus\client\gui\canvas\ClientGUICanvas.py", line 3115, in _PrefetchNeighbours try: These are the files being compared. The new one is larger and objectively better, and the smaller, older one causes the crash in the duplicate processor, so I'm just going to mark it as better and delete the other one. I can open both files just fine in the thumb and full media viewers.
That one uv anon from the previous thread here; thanks for keeping the pyproject.toml updated! Really appreciated, I only use this now and it makes installing and upgrading a breeze.
>>22114 Danke. I only skimmed earlier and thought it was a Linux issue they were discussing.
hey, hydev, i think your png color rendering is slightly off. gimp says her outfit should be #3C3C41 (measured in multiple spots, came up all the same), but a screenshot of hydrus shows it rendered as #3E3E43.
>>22156 actually it may not be exclusively a rendering problem, files that should probably be pixel perfect duplicates aren't counted as such. This jxl should be exactly the same as the PNG. (poking at it in gimp suggests this is probably the case) on catbox because 8chan doesn't like the future king of image formats: https://files.catbox.moe/rovuti.jxl
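For reference, a ~1-2 step shift on dark colours is the classic signature of an 8-bit intermediate somewhere in a colour pipeline. Whether that is what's happening in hydrus is for hydev to check — this sketch is purely a hypothetical demonstration of the mechanism, where the linear-light value gets quantised to 8 bits before converting back to sRGB:

```python
def srgb_to_linear(v8):
    """Decode an 8-bit sRGB channel value to linear light (float)."""
    s = v8 / 255
    return s / 12.92 if s <= 0.04045 else ((s + 0.055) / 1.055) ** 2.4

def linear_to_srgb(l):
    """Encode linear light back to an 8-bit sRGB channel value."""
    s = l * 12.92 if l <= 0.0031308 else 1.055 * l ** (1 / 2.4) - 0.055
    return round(s * 255)

def bad_roundtrip(v8):
    """Hypothetical lossy pipeline: quantise the *linear* value to 8 bits
    before going back to sRGB. Dark values come back shifted a step or two."""
    linear8 = round(srgb_to_linear(v8) * 255)  # premature 8-bit quantise
    return linear_to_srgb(linear8 / 255)
```

Dark sRGB values are packed very tightly in linear light, so an 8-bit linear intermediate can't represent them exactly; keeping the intermediate in float avoids the shift entirely.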
>>21334 Thanks, I saw this earlier, and it should be fixed in v620. I missed a case when cancelling the dialog that picks the type of comparator. No damage done. >>21350 >>21366 Yeah, feels like the decisions are easier when the filetypes are the same, and whatever happens, we are going to have a chunk of human decisions to make for alternates and trickier dupes. Personally I'd probably prefer jpegs over webps in all safe-ish 'jpeg is bigger/older'-style cases. While I'm not confident I can do much automation for alternates or watermark stuff, I somehow had 150k+ pixel dupe pairs in my system using the three suggested rules, which is so much human filter time saved I'm basically laughing. If I can crack easy resizes and re-encodes for the same filetype, and that's another hundred thousand, or maybe many more, I'll take a victory and not worry about complicated stuff for a while. >>21423 >holy shit you added that file-property comparator system I suggested already?! Slowly getting there. Had a few productive weeks, so I've been kicking things out. I had huge success last week figuring out the basic algorithm to do 'A is a resize or re-encode of B', excluding artist corrections, watermarks, or recolours, but I need to do more testing/tuning. I'm really happy I did the infrastructure work here for so long. Adding new tools is simple. I'll add some system:url and system:number of urls stuff to the comparators! And yeah, the long term plan here is to replace all the 'comparison statements' you see in the duplicate filter right-hand hover with the new comparator tech, and some associated scoring system. Slowly getting there--keep letting me know how it works for you as it all rolls out. >>21525 Yeah. I'd like to display this in all media viewers at some point.
At the moment, the media objects that back thumbnails don't 'know' or get updated about duplicate data, but I'll add this at some point and then I won't have to do secret database hits every time you open up a duplicate menu etc... and I'll be able to add a danbooru-like 'this file has two parents' kind of UI to things like the media viewer. >>21572 I'm not totally sure on the 8chan cookie situation, but I think the cookies.txt is enough, as >>18048 says. If you do an import and you get the 'looks like html' error, that probably means your cookies did not copy across in some way since hydrus is getting the TOS page instead of the file. The name of the click-through cookie changes every day or so, I think? So you have to keep syncing it either with cookies.txt or Hydrus Companion. Make sure the domain matches, if you are crossing .moe and .se. Some sites (usually when backed by CloudFlare) need the same User-Agent in hydrus as the browser, but I don't think that's true of here.
>>22178 >the long term plan here is to replace all the 'comparison statements' you see in the duplicate filter right-hand hover with the new comparator tech, and some associated scoring system Ahh, that'd be great! One of my long-time wishes for Hydrus is to have a way to intricately define scoring yourself, to make the scores more trustworthy and thus lower the mental burden of going through the duplicate filter. An important one off the top of my head would be comparing known urls, since there are certain domains that frequently produce good files and others that frequently produce bad ones. It'd be cool to encode that knowledge into the scoring. Anyway, very exciting to hear that a feature like that is coming at some point!
>>21619 Thank you very much for this report. I check the github dependabot alerts regularly, but this one was not reported there. I do not import bz2 in my code or package it explicitly. I have looked through the Linux build logs, and it looks like 'aggdraw', which is required by psd-tools, includes it in a subdir. There's another version in the basedir that is 1.0.8. I just now made a test build of v620 using Ubuntu 24.04 (instead of 22.04), but that still had it. Then I did so with Python 3.12 as well (up from 3.11), and it was still in there. So, this suggests something in there is pulling in 1.0.6 explicitly by version. I now ran the thing with some debug stuff and yep we get a bit of this: 2025-05-03T20:53:25.7854515Z 49432 DEBUG: Processing dependency, name: 'libbz2-a273e504.so.1.0.6', resolved path: '/opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/aggdraw.libs/libbz2-a273e504.so.1.0.6' So aggdraw is specifically asking for that bad version. I guess it includes it in its old wheel somehow. The macOS and Windows builds do not pull in a bz2 lib in an aggdraw subdir, so I am not sure how needed it is. Maybe it is for when an archive is bundled in a psd? I'm not sure if we would be vulnerable to this CVE, but I don't want to mess about, I want it gone. So, in any case, we want a fix for now. I have a test build here that simply removes the bad .so file from the build: https://github.com/hydrusnetwork/hydrus/releases/tag/v620-ubuntu-2404-06 I don't have a Linux testing machine right now, so can you test that for me? That build should be the same as normal v620 but without the bad file. If you extract it to your desktop, does it boot? If it boots, does help->about, 'optional libraries', say that 'psd_tools' is ok? If so, I'll make this the rule going forward for the Linux build.
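For anyone wanting to strip the bad lib from an already-extracted build by hand in the meantime, the fix is just deleting aggdraw's bundled copy (filename taken from the build log above; the demo below works on a scratch dir — point the find at your real extracted build instead):

```shell
# Demo in a scratch dir; substitute your real extracted build directory.
BUILD_DIR="$(mktemp -d)"
mkdir -p "$BUILD_DIR/aggdraw.libs"
touch "$BUILD_DIR/aggdraw.libs/libbz2-a273e504.so.1.0.6" \
      "$BUILD_DIR/libbz2.so.1.0.8"

# Remove just the vulnerable 1.0.6 copy, leaving the 1.0.8 one alone:
find "$BUILD_DIR" -name 'libbz2*.so.1.0.6' -delete

find "$BUILD_DIR" -name 'libbz2*'   # only the 1.0.8 lib remains
```

This is the same thing the test build linked above does at package time.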
>>21665 Yeah I would like to do this! Mine looks stupid too. It won't be super difficult I hope; I just have to get it done. >>21728 Not really, I'm afraid. Figuring out what a booru might call a 'tag wiki' is a long-term project I want to plan out one day. Hydrus just doesn't have good ways of saying 'show me a list of all the tags in xx domain, sorted by blah'. I also can't hack some 'oh just export it to clipboard' or similar tech since the numbers of tags we tend to deal with make it not a great user experience. You can hack a similar thing in the autocomplete tag search by allowing '*' wildcard searches and then ctrl+a -> ctrl+c, but it is a mess to actually deal with. I need some proper UI and search tech to make it good. >>21759 >>21760 >>21761 Thanks. Sorry for the trouble. I think I agree that you are now(?) being hit by something else, or at least a different expression of the same error. Maybe your window manager updated. Can you try running hydrus with each of these environment variables in turn: QT_QPA_PLATFORM=xcb QT_QUICK_BACKEND=software QT_OPENGL=software (if you are on nvidia) __GL_YIELD="USLEEP" And maaaybe a combination of them? It looks like the error is because I am screwing around with bitmap data before it has had a chance to flush to screen. Seems like I am being given the opportunity to resize a window before a paint event has finished, which generally speaking shouldn't happen (when I am in a 'resize window because user just did a shortcut' event, all the paint stuff is supposed to be clear). It is entirely possible my code is still doing something stupid, but it could also be a bug or something in your GPU driver, and if setting the backend to (slow, stable) 'software' clears the error, that might be it. >>21819 Well done for getting mpv working. I'm increasingly recommending all Linux users just run from source now. The build is a bit duct-taped. 
If you find you still enjoy hydrus after two to four weeks, I'd love to know how you found learning everything. Keeping the help up to date for new users is a constant battle.
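For the Qt crash debugging above, the environment variables can be set per-launch like this when running from source (the launch script name is a guess — substitute whatever you actually invoke):

```shell
# Try each in turn; these are the variables from the post above.
QT_QPA_PLATFORM=xcb ./hydrus_client.sh
QT_QUICK_BACKEND=software ./hydrus_client.sh
QT_OPENGL=software ./hydrus_client.sh
__GL_YIELD="USLEEP" ./hydrus_client.sh   # nvidia only

# Or a combination:
QT_QPA_PLATFORM=xcb QT_OPENGL=software ./hydrus_client.sh
```

Setting a variable on the same line as the command scopes it to that one launch, so your normal sessions are unaffected.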
>>21924 >>21928 Yeah check here if you want to know more >>20377. Check your 'client - date.log' in the db dir to see what it actually renamed. It is probably unicode garbage from years ago. >>21955 Yeah internal cbz navigation in the media viewer would be great. One day we'll have it. A lot of the 'file relationships' tech I have planned for after duplicates auto-resolution will actually work in this. Also proper cbr, cb7 support. >>22059 Thank you, this is really interesting. I guess siblings really are a core part of the chain. 1) Yeah, if you only do five minutes of 'do' sync, then the 'undo' needed should be about the same amount of time. 2) Yeah, when I reworked siblings and parents a while ago, the main objective was to make them 100% 'virtual' and undoable. If a parent relationship disappears, it is as if it never was added in the first place. The storage domain is untouched by sibling or parent data. There's no 'burn in' job yet, but there will be in some fashion. I generally call it 'hard replace'. The old siblings/parents system was an awful mix of soft and hard replace, hence why I wanted to rewrite it, but now that we have no hard replace, I want to bring it back in a more careful and user-driven way. It will be able to do en masse tag renames, including for namespaces. I do not want to make a big red button to 'hard apply' siblings and parents, because the current undo tech is very very useful and sane for all sorts of reasons, but I'd like the ability to bake in certain problem fixes and perhaps namespace renames. It'll launch on the PTR first probably, where the janitors are waiting for me to figure this out. 3) Yeah there would probably be the same loop problems because of lacking siblings. I guess when users are editing parents in the PTR, the dialogs are collapsing siblings to their ideals behind the scenes and in the workflows explicitly, and they don't notice that they are taking advantage of their chains being connected together by siblings.
Having just siblings or just parents may not be so useful an option to have available because of this. e.g.
A sib B
A parent C
On a service that only sees parents, B won't have C. On one that sees siblings and parents, A becomes B, and all Bs have C. This will be easier to visualise when I finally figure out a graph drawing widget for tag relationship chains. 4) Yeah you have it right. I gave them dumb technical names, but the 'multiple views' is the 'selection tags' box on a file search page, and the 'single views' is the media viewer tag box. Both are downstream from the base display tags domain. When you say 'hide this tag from here', it affects either the selection tags or media viewer tags. I don't like how the workflow worked out here, it is inhuman.
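To make the A/B/C example above concrete, here's a toy model of which tags display with and without siblings applied. This is my own simplification, not hydrus's actual resolution logic:

```python
def display_tags(stored, siblings, parents, use_siblings=True):
    """Toy model of sibling/parent resolution for the example above.
    siblings: {worse: ideal}; parents: {child: parent}."""
    ideal = (lambda t: siblings.get(t, t)) if use_siblings else (lambda t: t)
    # with siblings on, parent rows are also collapsed to their ideals
    eff_parents = {ideal(c): ideal(p) for c, p in parents.items()}
    out = set()
    for t in stored:
        t = ideal(t)
        out.add(t)
        if t in eff_parents:
            out.add(eff_parents[t])
    return out

siblings = {"A": "B"}   # A sib B (B is the ideal)
parents = {"A": "C"}    # A parent C

print(display_tags({"B"}, siblings, parents, use_siblings=True))
print(display_tags({"B"}, siblings, parents, use_siblings=False))
```

Without siblings, a file tagged B misses C entirely, because the parent row only exists on A — which is exactly why parents-only services behave oddly.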
>>22091 For 'publishing date on source domain', try sorting by 'time: modified time'. I record a modified date for the sites you download from, and most parsers pull a decent value for the 'post time' for it, too. For that modified time sort, I use the earliest value of all the known domain times and the disk file modified time. It can't do it yet, but I'd like to update that sort box to let you select the specific domain too, so you can sort as the files are on a particular booru etc.. For 'files with source domain', try 'system:url'. You can search for whether a file has a particular domain, though URL searching can be slow. It is a bit clunky, but it should work. User score is difficult. There's a way or two to get that in hydrus, but not a way to pipe it into a hydrus-side rating. Also, one of the problems with online ratings is that they change often, so you have to deal with conflicts in a cleverer way on any subsequent parse that happens. >>22094 I don't have good pool tech yet. Hopefully a future version of file relationships will support this nicely. >>22120 Great! I'm going to convert all the requirements.txts over to use it when I pull myself together. >>22156 Thank you for this report. I grabbed both files and I agree you can see just with your eyes that the lightness or something is very slightly different between the two when viewed in hydrus. Doesn't seem like the png has an ICC profile either. I will examine what is going on here. I do some bullshit processing on every file load to normalise everything to sRGB for technical ease, so maybe I am quantizing something in the wrong way. >>22180 Yeah exactly. Completely user-configurable, rather than my hardcoded stuff. We'll see how it goes.
>>22193 I'm >>22091 Subbed to your patreon. This is really amazing software and I think long term it can replace file managers. Not just managing booru images, but documents, memes, music, and projects. You can't tell a file manager "show me all my 1980s music" unless you already manually made a folder. Filesystem metadata is literally 30 year old tech. >>22187 There's a library location issue on Ubuntu vs Fedora. Let me spin up a VM and test for you. >>22193 >tagging with website I ended up going to Network>downloaders>default import options and manually added a website:<site> tag for each downloader. Works well enough. >sort by time: modified time Works well, even if it's technically not the right data. TY. >>22191 >polished .cbz support Amazing and I can't wait. .cbz support is also a great way to get more users from the manga/doujin community. >>22193 >pool tech It's basically going to require completed .cbz support to work properly. Pools are booru's shitty implementation of manga sites.
>>22187 Tested on Ubuntu 24.04 LTS >boots fine >help>about>optional libraries >psd_tools = True
>ATF downloaders broken by anti-DDOS measures >Begin checking former ATF subscriptions manually once a month, not too bad >Something is off this month though, two or three artists in a row with large batches of mixed new and old images recently uploaded >Check the uploader >Gelbooru Importer, likely something scraping posts from gelbooru >Adds anywhere from 5-10 pages of shit I have to go through manually for a couple dozen artists on ATF Fuck Gelbooru Importer.
>>22191 >Yeah you have it right. I gave them dumb technical names, but the 'multiple views' is the 'selection tags' box on a file search page, but the 'single views' is the media viewer tag box. Both are downstream from the base display tags domain. When you say 'hide this tag from here', it affects either the selection tags or media viewer tags. Thanks for answering. For this I was specifically referring to the second entry in the right-click 'experimental' menu. The 'display tags' tag display seems kinda buggy. Yes, the tooltip says 'may not work!' for all of the 3 modes (let's call them that for a moment). The first and third modes work though; the second one did not in my testing, or at least I don't know what this mode is supposed to show. So maybe there is potential for a fix. You say the second one is supposed to be connected to the filter/blacklist for single file views/media viewer? I can not verify that. That's why I was wondering what this mode is exactly for. Please bear with me, now it gets a bit confusing. My findings: 1. 'Hide' from right-click is only visible after activating the switch to 'multiple media view tags' tag display mode; it is not visible after switching to the 'display tags' or 'stored tags' tag displays. So there is no way to verify if 'display tags' mode (from the experimental right-click menu) = 'single file view tags' (from manage tag display and search...) filter/blacklist. 2. Opening a file in the media viewer and checking the taglist there, a right-click -> 'experimental' doesn't show a checkmark in front of one of the 3 modes, which also doesn't show me which one is actually the default there. So I guess none of those 3 are the default, that's why there is no checkmark, and doing tests by putting a 'test' tag into the 'single file view' tag blacklist (by going into 'manage tag display and search...'
options), does verify that the experimental 'display tags' tag display mode, would unhide the 'test' tag again, which it shouldn't, if it would be the same as the single file tag blacklist. Once the 'test' tag is unhidden, there is no way to hide it again while being in the media viewer by changing the modes, since the 'display tags' tag display isn't responsible for that it seems. It makes sense that the 'multiple media view tags' and 'stored tags' modes are not responsible for that though. 3. We know that if you 'hide' a tag from the thumbnail view tag selection (which only works for the first mode), the tag gets a blacklist entry in 'manage tag display and search...' -> 'tag filter for multiple file views'. If you open a file in the media viewer and don't change modes yet, it gets into the single file view blacklist. So once the 'test' tag is hidden in the media viewer, and you start changing the modes, they all unhide it, and you cannot go back to the default media viewer tag display again except you close the file and open the media viewer again. Thats why it seems the default media viewer mode is none of the 3 'experimental' modes, as i said before. So the question remains, what is the second mode "switch to 'display tags' tag display" for? Was it meant to be connected to the media viewer blacklist and be the default in the media viewer? If so, can you fix it if it's not too much work and maybe rename it, maybe to "switch to 'single media view tags' tag display. That would help me with not being confused every time i see it, but of course only if there is actually a fix needed and my assumption are correct. Thanks!
>>22208 Their mods said it will be done in a few days or so.
>>22219
>Both are downstream from the base display tags domain
Oh, I wasn't paying attention, here you name that domain. So the second mode probably has some sense behind it, but I have no idea what. Maybe add a 'single media view tags' tag display to the experimental modes, if the second mode isn't supposed to be that, as I was thinking. But some points still stand, I think.
>>22219 Holy shit bro just type big boob and get 3 million results.
>4 off
>>22187 >>22200 >>22194 Thanks for looking into this, but this issue is now moot. I decided to replace all the psd_tools stuff with manual file parsing and an ffmpeg rendering solution today. psd_tools is removed from the requirements.txts and thus this aggdraw library shouldn't be in v621 and thereafter. The library provided a lot of tools, but we only needed to fetch dimensions, icc profile existence, and pull a render preview.
Files on imageboards often have names referring to their origin: the image id on a booru. Has anybody made anything good to turn them into a list of booru links without switching more than two contexts?
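Not that I'm aware of, but the mapping is simple enough to sketch yourself. A minimal Python version (the filename patterns and URL templates below are guesses for illustration; check them against the sites you actually use before trusting the output):

```python
import re

# Hypothetical filename patterns -> post URL templates.
# Verify these against your own boorus; they are assumptions, not a spec.
PATTERNS = [
    (re.compile(r'^gelbooru_(\d+)'), 'https://gelbooru.com/index.php?page=post&s=view&id={}'),
    (re.compile(r'^danbooru_(\d+)'), 'https://danbooru.donmai.us/posts/{}'),
    (re.compile(r'^(\d+)$'),         'https://danbooru.donmai.us/posts/{}'),
]

def filename_to_links(filename):
    """Return candidate booru post URLs for one filename (extension stripped)."""
    stem = filename.rsplit('.', 1)[0]
    links = []
    for pattern, template in PATTERNS:
        m = pattern.match(stem)
        if m:
            links.append(template.format(m.group(1)))
    return links
```

Run it over a directory listing and you get a paste-ready list of links without leaving one terminal.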
>>22156 >>22158 Ok, following up on this, when I debug-mode load these files in hydrus, this happens:
JXL: Load file. It has an ICC profile. Before I apply the ICC, the pixel at 544,1352 is 3e3e43. After applying the ICC, the pixel is 3c3c41.
PNG: Load file. It has no ICC profile. The pixel at 544,1352 is 3e3e43.
Now, here is the kicker: when I load the files in gimp, I get the same results! 3c3c41 for the JXL but 3e3e43 for the png! When I import the JXL, it specifically mentions it has to do some colour adjustment because of an existing ICC profile in the image. The png just loads straight. When I load the files in qView and then copy/paste a screenshot to gimp, both are 3c3c41! I thought I was going crazy, so I did all the tests again, and then updated gimp, and gimp 3.0, when I import the png, now talks about 'Generated RGB profile from PNG's gAMA (gamma 2.2000) and cHRM chunks', which some research tells me are like an ICC profile but just different. So, it looks like the file is '43 raw but has colour adjustment stuff that moves it to '41, and in the png case I am not supporting it correctly. I've queued up a job to investigate this and see what I can figure out. Fingers crossed I can load this data like an ICC profile and just tell PIL to apply it in the same way. I'm sort of surprised it doesn't do it already on file load, but if gimp wasn't supporting it a year or two ago, I guess it isn't such a super huge thing. Thank you for an interesting report!
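For what it's worth, a '43 -> '41 shift is about the size you'd expect from reinterpreting a gamma-2.2 sample into sRGB. A rough back-of-the-envelope sketch (this assumes the target space is plain sRGB, which is a simplification of whatever gimp actually does internally):

```python
def gamma22_to_srgb(v8):
    """Reinterpret an 8-bit gamma-2.2 encoded sample as sRGB (rough sketch)."""
    linear = (v8 / 255.0) ** 2.2                      # decode per gAMA 2.2
    if linear <= 0.0031308:
        s = 12.92 * linear                            # sRGB linear segment
    else:
        s = 1.055 * linear ** (1 / 2.4) - 0.055       # sRGB gamma segment
    return round(s * 255.0)

# 0x3e (62) comes out at 0x3c (60), roughly matching the observed shift.
```

The two transfer curves are close but not identical, so mid-grey values drift by a couple of 8-bit steps, which lines up with the debug readings above.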
I had an ok week. I fixed some bugs, did some duplicates work, and cleaned up some build and environment issues. There's also some more user work for ratings, with many more rating shapes to choose from, resizable ratings, and a rating svg experiment. The release should be as normal tomorrow.
>>22401 >many more rating shapes to choose Why didn't I ask for that?
Any way to keep the image "text overlay" translations some websites like gelbooru/danbooru etc. have when saving an image? Like even having the translations go into the right-side description or something.
>>22404 It looks like notes won't load if you have Javascript disabled, so I don't think there's a way to do it from the post page. You might be able to do it by heading to the note history page and grabbing the translations from there, but I forget if Hydrus has a way for you to fetch some content, then send it back "upwards" to a previous context. I think a subsidiary parser might be able to do this if you wanna give it a shot.
>>22401 oh I thought there wasn't gonna be a release this week. pleasant surprise
>>22425 Yeah I had an IRL shit sandwich to eat, and it could have spiralled into a thing, but thankfully it just ate half of Monday. Should be normal releases for another four weeks, then a week vacation June 4th.
https://www.youtube.com/watch?v=NE7RoHjlKbk
windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v621/Hydrus.Network.621.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v621/Hydrus.Network.621.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v621/Hydrus.Network.621.-.macOS.-.App.zip
linux
tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v621/Hydrus.Network.621.-.Linux.-.Executable.tar.zst

I had an ok week. I fixed some bugs, did some duplicates work, and cleaned up some build and environment issues, and there's some more user work for ratings.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

ratings

Thanks to a user, we have more rating options. First, under options->thumbnails and options->media viewer, you can finally set the size of ratings! Second, under services->manage services, you can now set many more 'star shapes': triangles pointing up/down/left/right; diamond; rhombus pointing right/left; pentagon; hexagon and small hexagon; six point star; eight point starburst; hourglass; large + and X shapes. In an experiment, you can also set custom svg star shapes. I've thrown a couple of very basic examples into a new install_dir/static/star_shapes directory, but you can add more yourself. Try to add something that's squarish and has a clean simple silhouette on top of a transparent background. We're debuting some unusual new drawing tech here, and you may see some new misalignments or clipped borders at certain sizes. I'm going to keep working here to nail down good padding and margins, and we'll play around with svgs more to see about getting nice clean borders showing up. If it all works out, I expect we'll migrate all the existing hardcoded polygons to svg. We're also looking at unicode character icons too.

duplicates stuff

The duplicates filter now prefetches the next five pairs of images, for faster image rendering (it used to just do the next pair).
You can now alter this number under options->speed and memory. I fixed an issue where flat colour images of the same num_pixels but differing resolution were counting as pixel duplicates. The duplicates auto-resolution system now lets you do 'system:known url' and 'system:number of urls' for 'test A or B', and 'system:number of urls' for 'test A against B using file info'.
build stuff

I went on a 'cleaning out old libraries' kick this week, prompted by a user report about the PSD-reading library we were using (psd-tools). The psd library was pulling in a drawing library, aggdraw, which, in Linux at least, was including a very old and vulnerable version of a bz2 decoder. I don't know if this decoder ever actually ran for what we were doing, but I didn't like having this complicated library with possible security problems when all we use it for is grabbing resolution, icc profile, and a preview image. I hacked together a file parser and some other solutions we had lying around, including an ffmpeg renderer, and now the program no longer needs psd-tools. Some PSD rendering may be a little worse, but I also improved some transparency handling, so some files are better.

Similarly, I removed the swfrender executables from the bin directory. These are a fun old tool to make flash thumbnails, but the project has been discontinued for a long time, and the exes are very old and we shouldn't be relying on them, especially for a crazy format like flash. New flash files you import will get the default 'flash' icon. For the future, the excellent Ruffle library is putting together a modern version of this render executable, so when that is ready, I'll investigate bringing this functionality back.

On the other side of things, the AVIF library test last week went well, so I'm folding that in today. We should have better AVIF rendering.

next week

I want to integrate the 'A is an exact match of B' tech into the manual duplicate filter so we can test it with real world data.
I read mpv supports viewing images so I tested it. Not only does it view images, but it can view ".cbz" files too! For basic .cbz support you should be able to open it just like a video. All you need to do is pass this flag to disable the automatic 5 second slideshow, or have users press spacebar after opening. I even changed the file extension from .zip to .mp4 and it worked perfectly! example command
>mpv 'file.zip' --image-display-duration=inf
I think the last thing you would need for manga support would be scraping tags from ehentai. This step requires manual website browsing, because manga sites have lots of duplicates. It would probably open a browser window inside hydrus, let you search, then have a button for "scrape tags" that imports them. The 2nd part is very complicated; the 1st part should be easier to implement. Alternatively you could auto-scrape ehentai tags, which is probably easier to implement, just less reliable.
>>22433 dev = god. Glad your IRL stuff went well enough
Now that this auto-resolver has gotten a decent amount of use from me, I'm realizing that many of the better dupes being kept by the auto-resolver are the ones that don't have the "nice" known urls like Pixiv and such. Because of this, I found myself searching for a file from a certain domain, but not being able to find it, because that one was actually deleted and the dupe that was kept doesn't come from that domain. This is fine since the file itself is the one I prefer, but it makes searching with known urls way less reliable now. Because of that, I'd really like it if you could add an option for the known-url search (maybe a checkbox?) for the search to count the known urls of any files in a duplicate group. By that I mean that if there's a (for example) pixiv file and a gelbooru file, and they're dupes where I kept the gelbooru file and deleted the pixiv file, the known url search would still return the gelbooru file for searches for files with pixiv as a known url, since it looks at any files in the same duplicate group. I don't know if that sounds complex but I'm not sure if I'm explaining it well. Basically, when the option is enabled, count the known urls of any file in a duplicate group when considering whether a file matches some known url search. Or alternatively, if you think this would be better or easier to implement, only count the known urls of duplicates when checking whether the king matches the search, and ignore them for the others. It doesn't really matter to me, since all I want is for known-url searches to return the right files regardless of which duplicate was the one that was kept, without needing to resort to incorrect behavior like copying known-urls between dupes.
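The search logic being asked for here is easy to sketch in isolation. A toy Python version (every structure and name below is made up for illustration; this is not how hydrus stores anything internally):

```python
def search_by_known_url(files, substring, expand_to_group=False):
    """Return king file_ids whose known urls match `substring`.

    `files` is a hypothetical map of file_id -> {'urls', 'group_id', 'is_king'}.
    With expand_to_group=True, a king matches if ANY member of its duplicate
    group (including deleted files) has a matching url.
    """
    # collect every url seen anywhere in each duplicate group
    group_urls = {}
    for f in files.values():
        group_urls.setdefault(f['group_id'], set()).update(f['urls'])

    hits = []
    for file_id, f in files.items():
        if not f['is_king']:
            continue
        urls = group_urls[f['group_id']] if expand_to_group else set(f['urls'])
        if any(substring in u for u in urls):
            hits.append(file_id)
    return hits
```

The key point the sketch shows: no urls are copied between files; the search just widens its lookup to the whole group at query time.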
>>22445 To simplify, you're asking that deleted duplicates get metadata retained so you can still view their source URLs?
>double posting
I think the Hydrus network software is an amazing foundation and has an amazing future for file management. I think it makes many uses of file managers obsolete. With Hydrus server it could possibly compete with ehentai's hentai@home software. That is peer-to-peer gallery distribution to reduce load on their servers. A lot like bittorrent, but with so many powerful metadata features built in. Current OSes cannot support metadata sorting natively. It can not only be made for personal use, but also for reducing load on boorus and sharing "not copyrighted" works. It is a good way to collectively back up data if X website goes down. It's heavy shit, but also my fantastical delusion. It's the most interesting software I've found in a long time.
>>22461 deleted duplicates already have their metadata retained afaik. I'm asking for an option to have their urls be counted when doing "known-urls" searches on the non-deleted duplicates. just lumping together all the known-urls in the duplicate group for the search, so that a search for artwork from a certain domain will still work properly, regardless of duplicate deletions.
I've been going through my collection and shrinking file sizes with optimizers, and one thing I'd like to see is the ability to merge import and archive times with the duplicates processor.
>>21913 hi just drawing attention to this from last week
I noticed in the duplicate filter that Hydrus treats webps as being similar to jpegs when compared with PNG files, in that it treats the webp as being the obvious better dupe and says that the png is just wasting space and should be deleted. But doesn't WebP have a lossless mode? If that's true, then couldn't either one be the original file? Or is Hydrus able to tell when the WebP is losslessly encoded and decide from that?
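On the lossless question: a simple WebP file declares its codec in the chunk FourCC at byte 12 of the RIFF container, so sniffing lossy vs lossless is cheap. A sketch (extended 'VP8X' containers would need their sub-chunks walked to know for sure, which this skips; no claim about how hydrus itself decides):

```python
def webp_mode(data: bytes) -> str:
    """Classify WebP bytes as 'lossy', 'lossless', or 'extended'.

    Only inspects the first chunk FourCC after the RIFF/WEBP header;
    a VP8X container needs a deeper look at its sub-chunks.
    """
    if data[:4] != b'RIFF' or data[8:12] != b'WEBP':
        raise ValueError('not a WebP file')
    fourcc = data[12:16]
    return {b'VP8 ': 'lossy', b'VP8L': 'lossless', b'VP8X': 'extended'}.get(
        fourcc, 'unknown')
```

So a tool absolutely could special-case lossless webps instead of assuming every webp/png pair has an obvious winner.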
>>17718 >>18260 Found an interesting post from a developer of JPEG XL that confirms lossless jxls are currently much slower to decode. For any CPU-limited jxl users who don't need their images to be pixel perfect, encoding as max-quality lossy instead should get you much faster decode times. For both lossy and lossless you can also set the flag "--faster-decoding" to a range of 0 to 4, higher allowing faster decoding with the trade-off of worse density or quality. Unsure if high effort loses these benefits. Also keep in mind effort levels aren't simply attempting more passes at finding better compression; each one actually adds additional features to the process. https://github.com/libjxl/libjxl/blob/main/doc/encode_effort.md
PR for lossless decoding improvements: https://github.com/libjxl/libjxl/pull/4201
>Lossless is currently slower to decode than lossy by default, but me and a friend are working on that. We're overhauling the faster decoding setting, so you have the option of making the filesize up to 20% larger but the decode speed 50% faster than lossy too (over 3x faster than default).
https://old.reddit.com/r/jpegxl/comments/1k4gq9l/decoding_speed_issues_with_macos_generally_and/moa0nqk/
>>22189
>QT_QPA_PLATFORM=xcb
>QT_QUICK_BACKEND=software
>QT_OPENGL=software
>__GL_YIELD="USLEEP"
I ran it with different combinations of them enabled (including all at once) and I didn't notice anything different at all. The crash occurred the same way, and Hydrus didn't seem to act any differently. An important thing I should mention though is that I already use QT_QPA_PLATFORM=xcb, because MPV crashes without it; I was told by another user that this would fix it and it did. Though I've been doing that since before the crash started happening for me when I updated to v614, so I doubt that's the cause, but it's probably important to know. I don't think it was a window manager update that caused this either, because it happened right after updating. I wish I had more info to give, but I really have no clue what the problem would be. The only other things I can think of adding are that the actual MPV application itself (like when you open the file externally) zooms in just fine with no errors or crashes. Also, if I zoom in just once when viewing the file in Hydrus, it won't crash, but there will be massive tearing at the top of the screen, and panning around will cause the view to freeze and jump around a lot. Zooming in any more will crash Hydrus.
>>21913 >>22470 Thanks, sorry, I missed this. This is a great idea and a clean fix to this ongoing problem. I will add an option for it in the 'delete lock' panel itself.
>>22194 I'm glad you like it, and thank you for the support! When I think about hydrus, I can often only see the faults and bad design/code, but I am pleased when I need something weird and can just pluck it out of millions of files in ten seconds. I was recently dealing with a 'temporary' download folder overspill that had 1200 mid-length youtube vids in it. I wasn't getting through the queue fast enough and it grew over two or three years, and due to being lazy about a particular backup zone, I almost lost it when some hardware died. I managed to recover the drive the other day, and I realised 'oh, I wrote this whole software package to deal with a dumb queue like this, just import it all', and all that hassle and worry about managing this 200GB of stuff is now gone. Will I ever be able to finish the whole queue? Maybe not. But I am making progress still, and I am not worrying about managing the files.
>>22219 >>22223 Yeah you got it. Basically it looks like this:
- storage - the actual tags you edit
- display - sibling and parent computed
- single display - display tags with the 'single' filter applied
- multiple display - display tags with the 'multiple' filter applied
And there's a bit more fun where 'all known tags' is the union of all the tag services. I hate how convoluted it is, but it mostly works for the widgets we have. I wouldn't be surprised if it all gets reworked one day. 2) That's interesting that the media viewer list is inheriting the menu and not showing the option to show single views. I'll check that. Regarding 3 and your general problem here, I am pretty sure the logic of how that experimental mode applies could be fucked up.
There's some clever media update code that tries to optimise tag calculation and display for those lists, and I'm pretty sure that experimental thing just says 'ok, tag_display_type is now X, do an update call'. It might be fetching and recalcing based on cached filtered tag data correctly, but idk. I mostly added that mode for a guy who has a crazy tag setup that uses the single/multiple filters a lot, but I haven't touched it in a couple years I think. >>22355 You may be able to wangle this in future when I make the sidecar system work internally (which will let you convert URLs to tags and vice versa). My suspicion is this would be quite complicated to do and not worth the effort, but if you had a bunch that all followed the same pattern then maybe you could do some regex filtering and replace. >>22404 >>22417 We've experimented with this a bit, but I'm not sure if we ever made progress on a nice solution. The notes are usually JSON on the page, iirc, or some other javascript-like structure, so we can parse them to a note but they wouldn't be super human friendly to read. We thought maybe we could start parsing these and have some 'hide notes with this name' tech, and then one day roll out something that would actually render them in client, but it has always been something that is a lot of work for something that's two ticks on the side of niche. For real the answer here is probably to wait a couple years for some AI model executable that can do live translation notes on any arbitrary page and then pipe that into hydrus.
>>22441 Thanks, this is an interesting idea. I didn't know mpv could do that, but it is always an impressive program. This puts me in a slightly odd position, since my plans were to parse the contents of cbz files myself and then have some UI to display 'x/y' internal page navigation and system:num_pages stuff. If I simply let mpv handle it, that saves me having to implement much of that, but we'd lose some control over the whole thing. We'd also have to figure out sending 'next page' commands to libmpv, but that's probably doable without a lot of trouble. I could just try to get mpv displaying them as an interim step, which is probably the right thing to do, if simple. If it plugs into my scanbar as a page position browser, I'll add it immediately. So I think I'll have a play with this, but I can't promise I'll go for it, since I think I do eventually want to be doing this with my own calls. I want the media viewer to get some new 'navigate on a sub-carousel' tech for some things in file relationships, too. I don't know how well this works, since I don't use the site myself, but I think some people use this: https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/tree/master/Downloaders/E-hentai I presume you'd want to use Hydrus Companion or cookies.txt to figure out their login system. >>22445 >>22461 >>22466 This is a very interesting idea. We've thought about many different ways to effect 'this duplicate group has this metadata', and things like retroactive content merge across duplicate groups, but simply expanding the normal file search to work over duplicate groups is a clever and perhaps great solution. I think this search code would be possible and not titanically difficult. I am not sure if I would add it to particular system predicates or if it would be better to have it somehow as a checkbox for the entire search.
It may be simpler to perform an entire search over all possible members of duplicate groups and then filter the results to kings, but I'd have to think about it more. You are correct that deleted files retain all their metadata, and that includes duplicate relationships. I have generally thought of implementing a retroactive duplicate content merge 'big red button' where you say 're-apply content merge rules for all duplicate groups', so we can fill in various historical holes, and I am pretty sure we have all the data we need to make this happen. Can I ask if you have considered merging URLs in your duplicate merge options? Since you want to treat king files as if they have the URL in many cases, why not actually give them the URL? It isn't perfect, since the actual file at the destination is a resize or re-encode or whatever, but that's true any time CloudFlare 'optimises' a file in transit. >>22463 Don't ever look into the hydrus file repository, or your hopes will be completely dashed, hahaha! I originally planned hydrus to be a p2p network for imageboard-sized files, and in fact it was originally a p2p imageboard, but after the very first betas, I realised that the problem of managing files was far more pressing than that of sharing them. More than a decade later, I've made this crazy thing and there still aren't many ways to manage files. I have had some experiments into sharing files, like the hydrus local booru we used to have (it did simple html gallery shares), and the hydrus file repository (it works like shit and is severely under-developed, but a hydrus client can plug into a server that hosts some files), but I've never really gone full tilt into it. I hate networking code. I recently brushed up our IPFS support, and I think that could be a decent avenue to explore. 
You'd want a hydrus repository or some other method for neatly sharing sha256/IPFS multihash pairs, and I should re-add better IPFS downloading in-client, but that's a fairly decent p2p network that can run on top of hydrus, and with multihash lookup paired with the PTR you'd get crazy local search-and-download abilities. I think if we ever wanted/needed a way to share files, I'd prefer to simply have it work like this as a hardcoded plugin, or through the Client API, rather than dev it myself. We've seen a lot of tightened-up logins and CDN filters in the past year, particularly as AI has exploded and model scrapers are eating everything they can get at. I'm not sure what the future is, but I think it'd be a shame if pretty much every site ends up with Real ID verification to browse anything. For the time being though, I think it is still relatively easy to send files and tricky to hold them, so I'll keep working on weird stuff like duplicate resolution but keep in the back of my mind that maybe we'll want some new Client API calls or IPFS support to figure out whatever clever way the future generation of boorus want. But if we are talking black swans, I wonder if the future may not be oriented around hoarding/sharing files but instead something AI related, like mass local generation. Sometimes I go through old files in my hydrus collection, and I forgot I ever saw that file before. What's the difference between that situation and a high-quality AI-generated image based on my collection? Not a simple answer. I'm sure we'll still want to pin fixed and specific memories, like say vidya screenshots, but I wonder if the future of hydrus-style file management will be in larger part to manage a quantum cloud of ideas that we collapse using a model. 
We are getting thicker and thicker bandwidth pipes everywhere, but the human eyeballs remain this stubborn weak link in the chain, so perhaps what we'll care about in ten years is optimising the data entering that level, whatever that means. Same with human ears--no matter how many thousands of hours of music you can access, there's only so much you can listen to each day, so how do we optimise that experience?
>>22221 What will be done in a few days?
>>22468 Thanks, great idea! >>22482 Thanks--I don't think I'm being that clever, so I will revisit this and make it work better. I daresay that most people generally still want the webp in these situations since I bet the compression is better, and the png is probably still the child in the vast majority of situations due to Clipboard.png, but it isn't so clear cut. >>22493 Thank you--this is interesting news. Let's hope some related lossless decoding improvements percolate down to our JXL library in the nearish future and it suddenly works better one day. I don't think I'll try to wangle any 'faster decoding' tech into our pipeline--I've never been clever enough to do that 'multi-pass' grainy->smooth decoding you see on large jpegs sometimes in web browsers--but it is cool that JXL has that built in, and that it is so clever. I really do think they did a great job on this spec. If the lossless decoding remains super slow, I may add this as an option one day, but it would mess with stuff like pixel hashes unless I do a bunch of other work. >>22499 Damn. Since the geometry stuff can now run either way and still gives you an error, but it seems the error happens on a particular hydrus version, I wonder if it is related to some library update somewhere. Could be simply some other weird code change I made, but I can't easily explain your error being rare and it being some .so problem unless it is some weird specific version incompatibility somehow. Maybe I updated Qt around here. Tell me if we already tried this, but I don't see it, so can I verify you are running from source? 
I guess if we are seeing Python 3.12 errors, you are, but if not (or running from flatpak or something), please move to source, as here: https://hydrusnetwork.github.io/hydrus/running_from_source.html This fixes many OS-program errors since there is one less compatibility layer between python and the respective (and more native to your OS) .so file being called, and it gives us more power over your environment. Then I think our next step is to try building your venv with different versions of Qt and/or PyQt6 instead of PySide6. If you are new to running hydrus from source, try making the venv with the (s)imple version first, but then try (a)dvanced and choose some different Qt versions. Maybe the (t)est Qt version fixes some bug that your OS is having trouble with.
>>22514 The "faster decoding" tech is actually on the encoder: when encoding you can decide if you want a slight size trade-off for much faster decode times on that specific image. In my experience with the nightly build of libjxl with the changes already implemented, using lossless at effort 10 (which is very fast to encode when faster decode is set to the max of 4), jxl still comfortably beats png in size, and Hydrus doesn't lock up the rest of the PC. Even though the jxl was encoded with an unreleased version of libjxl, Hydrus is still able to read them and benefit from the faster decode times. What you're thinking of is progressive decoding, which requires a progressively encoded jxl. I think those are rarer and the encode/decode speeds are much slower for them compared to the one-and-done normal images. Unrelated, but for anyone reading this, make sure if you're converting jpegs to jpegxl that you use the jpeg recompression option. It's a lossless conversion which saves a lot of space compared to making the encoder treat the jpeg as a bitmap.
any chance you could add an option to have the rating "bar" show the value that it's on to the left of it, like it does in the tooltip? it's often hard to tell at a glance what rating I gave just going by the stars.
(881.46 KB 3638x2072 wd14 tag reader test.jpg)

>>22505 I set .cbz to default open with MPV and the only issues I see are it defaulting to 5 second slideshow mode, and it using < and > instead of arrow keys. Should work. I'll test prototype builds if you want, any OS.
>e-hentai downloader + companion extension
It works, but doesn't save tags, or save into a .cbz. Close, but not what I can use. *shrug*.
>Will I ever be able to finish the whole queue?
This is why I think AI image/video tagging is the solution. Still a huge can of worms, and there are already piles of worms everywhere.
>sidecar to import URLs as tags
I just manually edited download importers to add site:gelbooru or site:e621 as tags for each one. Works well enough for my needs. I don't see the point of going to the source booru once you have already scraped all the important info.
>>22507 Funny that a p2p network was your original intention. I think you're lost in the sauce of the project, and haven't realized how powerful Hydrus is at managing files. This is amazing technology that nobody has attempted before. Taken to its extreme, Hydrus is a replacement for the world wide web in an organized, fault tolerant, decentralized manner. *takes meds*
>real ID verification to get everywhere
That seems to be where it's going :( Getting basic p2p functionality working is a good idea right now. Content is being locked down online. If you throw together shitty prototype builds of this stuff, I'll definitely spin up VMs to do testing for you. Obviously saying it is much easier than implementing it.
>will the future be AI local mass generation?
Haha, I thought a similar question. I used the WD14 image tagger to recreate an existing image, picrel. Other than missing styles, it's pretty accurate. I wouldn't worry about mass local AI gen eliminating anything. Your argument defeats itself: if mass local AI gen eliminates hydrus, then hydrus will have been a stepping stone to advanced AI learning anyway. Very philosophical observation!
I'm going to make a 2nd Hydrus instance and try using it to manage movies. I have "few" enough files I won't mind manually tagging everything. I'll just plan out my own tag structure and not bother integrating parent/child tags. It should work great in theory.
i recently fixed a long-standing minor hardware problem that had been causing semi-regular crashes, several of which occurred while hydrus was probably writing to the database. my database has never shown issues and passed the integrity check when i ran it. is there anything else i should do to clean up and make sure there's no corruption?
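Probably the main extra step is making a fresh backup now that the hardware is fixed. If you want to re-run the checks yourself with the client closed, SQLite exposes them directly; a sketch using Python's sqlite3, run against each of the client*.db files in your db folder (file names are from memory, double-check your own folder):

```python
import sqlite3

def check_db(path):
    """Run SQLite's corruption checks on one database file.

    Run only while the client is closed. A result of [('ok',)] from each
    pragma means no corruption was found.
    """
    con = sqlite3.connect(path)
    try:
        quick = con.execute('PRAGMA quick_check;').fetchall()
        full = con.execute('PRAGMA integrity_check;').fetchall()
        return quick, full
    finally:
        con.close()

# e.g. for name in ('client.db', 'client.mappings.db'): print(name, check_db(name))
```

If both pragmas come back 'ok' on every file, the databases themselves are fine, and the remaining risk from those crashes is just partially-finished imports rather than corruption.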
Feeling very dumb here, I know Hydrus has a slideshow feature, but I cannot find the button. I've definitely used it before but I swear it isn't showing up. I'm on version 617 Windows by the way
>>22628 media viewer, right click the image.
This might be because of some shitty wangblows 10 update, because I swear this wasn't happening before and I haven't updated hydrus lately (v620). Every time I leave the media viewer it defocuses the Hydrus Client window and I have to click on it again to continue using the keyboard. When it defocuses, the taskbar pops up on my secondary monitor. Turning the monitor off doesn't stop the defocusing, but the taskbar doesn't pop up. If I drag the Client window to my secondary monitor, it doesn't defocus when entering and exiting the media viewer. Anyone have a clue what's happening and how to fix it?
>>22643 Went ahead and updated to 621. Same behavior. I think a Windows update is trying to screw over my workflow.
>>22643 >>22644 Nevermind, it just stopped happening, and I have no clue why. Maybe something to do with repeatedly dragging the window between monitors, in and out of maximization, and jumping in and out of the media viewer.
I found an apng that won't load in the hydrus image viewer or mpv. Low priority, everything else works fine. (warning explicit degenerate content) https://e621.net/posts/5031914
(211.78 KB 1600x1009 710be.jpg)

>>22654 Seems to be a separate APNG issue, can't open it with native mpv or other programs like feh or gwenview, and GIMP also fails to recognize it as APNG, so Hydev sure won't help you here. Works fine in Firefox for some reason.
>>17530 Hello, this is >>17498 again. Had the issue happen again; unfortunately I didn't have profile mode on at the time so I can't give the full picture. But I had enabled request logging and tried profile logging during the issue, so here's what I saw. From the main log, no errors. The requests were working with no issue until they simply, randomly stopped.

v612, 2025-05-12 18:45:03: 45001 GET /get_files/file 200 in 2.07 milliseconds
v612, 2025-05-12 18:45:06: 45001 GET /get_files/file 200 in 3.74 milliseconds
v612, 2025-05-12 18:45:09: 45001 GET /get_files/file 200 in 39.8 milliseconds
v612, 2025-05-12 18:45:12: 45001 GET /get_files/file 200 in 1.9 seconds
v612, 2025-05-12 18:45:16: 45001 GET /get_files/file 200 in 1.1 seconds
v612, 2025-05-12 18:45:17: 45001 GET /get_files/file 200 in 593 milliseconds
v612, 2025-05-12 18:45:19: 45001 GET /get_files/file 200 in 43.2 milliseconds
v612, 2025-05-12 18:45:30: 45001 GET /get_files/file 200 in 835 milliseconds
v612, 2025-05-12 18:45:30: 45001 GET /get_files/file 200 in 71.9 milliseconds
v612, 2025-05-12 18:45:37: 45001 GET /get_files/file 200 in 50.2 milliseconds
v612, 2025-05-12 18:45:40: 45001 GET /get_files/file 200 in 39.9 milliseconds
v612, 2025-05-12 18:45:46: 45001 GET /get_files/file 200 in 46.3 milliseconds
v612, 2025-05-12 18:45:53: 45001 GET /get_files/file 200 in 42.6 milliseconds
v612, 2025-05-13 09:57:40: Profile mode on!
v612, 2025-05-13 09:58:53: Profiling done: 3 slow jobs, 4,474 fast jobs

After the issue appeared I tried turning on profiling and making requests, but nothing happened. The requests were still simply timing out. Even more, the profiling log file didn't even get created from the requests being sent to the api alone. It's not until I started to do actions in the gui that a "client profile" log file appeared in the db folder.
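For future repro attempts, it can help to pull the slow hits out of the log automatically rather than eyeballing them. A quick sketch, assuming the exact log line format shown above; the regex, function name, and 500ms threshold are mine, not anything hydrus ships:

```python
import re

# Matches Client API request lines of the shape shown above, e.g.
# "v612, 2025-05-12 18:45:03: 45001 GET /get_files/file 200 in 2.07 milliseconds"
LOG_LINE = re.compile(
    r"^v\d+, (?P<ts>[\d-]+ [\d:]+): \d+ (?P<verb>\w+) (?P<path>\S+) (?P<status>\d+) "
    r"in (?P<value>[\d.]+) (?P<unit>milliseconds|seconds)$"
)

def slow_requests(lines, threshold_ms=500.0):
    """Return (timestamp, path, duration_ms) for requests over the threshold."""
    out = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if not m:
            continue  # not a request line; skip boot messages etc.
        ms = float(m["value"]) * (1000.0 if m["unit"] == "seconds" else 1.0)
        if ms >= threshold_ms:
            out.append((m["ts"], m["path"], ms))
    return out
```

Feed it the whole client log and you get a quick timeline of when latency started climbing before the hang.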
What do you do with useless tags added by downloaders/ai taggers? I considered just making them siblings with something like 'meta:useless' but they still sort by the original tag in the tags dialog.
I had an ok week. I fixed some bugs and integrated my new 'A and B are visual duplicates' algorithm into the duplicates filter for an advanced user test. The release should be as normal tomorrow.
>>22674 I've got a tag quality rating, so every so often I'll search my images for system:has no rating for Tag Quality, and manually prune bad tags and move the good ones over to the 'my tags' domain.
>>22674 >they still sort by the original tag in the tags dialog you can change that behavior. I have it so that they sort by the "visual" tag to make them easy to find. I thought that was the default.
>>22674 There's the tag migration system under Tags -> migrate tags... which can be used to delete tags.
>>22193 >I'm going to convert all the requirements.txts over to use it when I pull myself together. Sorry, I meant to reply before; I believe you already have this though, as I converted every `requirements.txt` file in the pyproject.toml I sent you, and I think you kept them updated (you moved things around in v612). If you `uv run hydrus_client.py`, you run the app with the `default-groups`, but for example I use `uv run --group qt6-new-pyqt6 --no-group qt6-new hydrus_client.py`, which replaces `setup_venv.sh` entirely.
https://www.youtube.com/watch?v=fDzXbdxeeHI
windows zip: https://github.com/hydrusnetwork/hydrus/releases/download/v622/Hydrus.Network.622.-.Windows.-.Extract.only.zip exe: https://github.com/hydrusnetwork/hydrus/releases/download/v622/Hydrus.Network.622.-.Windows.-.Installer.exe macOS app: https://github.com/hydrusnetwork/hydrus/releases/download/v622/Hydrus.Network.622.-.macOS.-.App.zip linux tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v622/Hydrus.Network.622.-.Linux.-.Executable.tar.zst I had an ok week. There are some bug fixes and new duplicate tech for advanced users to test. Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html highlights The recent resizable ratings changes broke on a particular version of Qt (PyQt6). If you were caught by this, and perhaps even couldn't boot, sorry for the trouble! I do test PyQt6 every week, but this somehow slipped through the cracks. Ctrl+Shift+O will now launch the options dialog. If your menubar gets messed up because of a setting, this is the new fallback. You can now paste multiline content into the text input of a 'write/edit' tag autocomplete that has a paste button (e.g. in 'manage tags'), and it'll recognise that and ask if you want to essentially hit the paste button instead and enter those lines as separate tags. If you would do this a lot, a new checkbox in options->tag editing lets you skip the confirmation. I improved some PNG colour correction this week. I think it will make about one in twenty PNGs a small shade brighter or darker--not usually enough to notice unless you are in the duplicates system. If you notice any of your PNGs are suddenly crazy bright/dark/wrong, let me know! A couple of new checkboxes in options->files and trash let you, if you have the archived file delete lock on, pre-re-inbox archived files that are due to be deleted in the archive/delete or duplicate filters. I don't really like writing exceptions for the delete lock, but let's try this method out.
duplicate test If you are in advanced mode, the manual duplicate filter will have a new '(not) visual duplicates' comparison line in the right hover, along with some mathematical numbers. This is the 'A and B are visual duplicates' tech I have been working on that tries to differentiate resizes/re-encodes from files with significant differences. I have tuned my algorithm using about a hundred real pairs, and I'd now like users to test it on more IRL examples. It sometimes gives a false negative (saying files are not visual duplicates, despite them being so), which I am ok with from time to time. What I am really concerned about is a false positive (it saying they are visual duplicates, despite there being a recolour or watermark or other significant change). So, if you do some duplicate filtering work this week, please keep an eye on this line. If it predicts something wrong, I would be interested in being sent that pair so I can test more on my end. Feel free to look at the numbers too, but they are mostly for debug. Assuming this goes well, I will re-tune this detector, polish its presentation, and enable it for all users, and I will add it as a comparison tool in duplicates auto-resolution. next week I think I will keep chipping away at my duplicates auto-resolution todo.
>>22704 >I improved some PNG colour correction this week. I think it will make about one in twenty PNGs a small shade brighter or darker--not usually enough to notice unless you are in the duplicates system. If you notice any of your PNGs are suddenly crazy bright/dark/wrong, let me know! Hey, I'm the anon that originally reported this issue. Thanks for the quick turnaround! I think the original case is mostly resolved; the png and jxl now look identical to me in hydrus. However hydrus still doesn't report the two as pixel-perfect duplicates. I did run them through imagemagick's compare command (a hopefully more accurate test than randomly poking them with GIMP's color picker), and imagemagick says there's an "Absolute Error" of 0 (0), which I believe means there are zero pixels that are different. I'm not sure if this is related to the original issue, or a separate one though. Unfortunately, in fixing the first issue I think a new one's come up. I've got a PNG (attached, mildly NSFW) that's wrong (according to GIMP's color picker). I think this one might be caused by hydrus not checking for an ICC profile if there's a cHRM chunk, given this image has both, but hydrus doesn't report the ICC profile in the duplicates processor.
>>22704 >You can now paste multiline content into the text input of a 'write/edit' tag autocomplete that has a paste button (e.g. in 'manage tags'), and it'll recognise that and ask if you want to essentially hit the paste button instead and enter those lines as separate tags. If you would do this a lot, a new checkbox in optons->tag editing let's you skip the confirmation. I was really hoping when I read this that there would be an auto 'no' option, but the only option is auto yes. I guess this is sort of getting into xkcd "how dare you, I was using your program as a heater by holding down the spacebar" territory, but when I add comics that lack a title that were posted to social media, I usually just copy a large chunk of the social media post, which often contains new lines. Thanks as always, though!
>>22712 Quick follow-up, poked around the png spec a bit, turns out there's an order to which color correction chunks take precedence. That is: 1. cICP 2. iCCP 3. sRGB 4. cHRM and gAMA You should use the first one in that order you support, and ignore the rest.
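That precedence is easy to encode. A minimal sketch of the selection rule (my own helper, not hydrus code), taking the chunk type names found while walking a PNG's chunk list:

```python
# Precedence of PNG colour chunks per the spec note above:
# cICP > iCCP > sRGB > (cHRM and gAMA together). Use the first
# supported one that is present, and ignore the rest.
PRECEDENCE = ["cICP", "iCCP", "sRGB", "cHRM", "gAMA"]

def pick_colour_chunk(chunk_types):
    """Given the chunk type names found in a PNG, return which colour
    source should win, or None if no colour chunk is present."""
    present = set(chunk_types)
    for name in PRECEDENCE:
        if name in present:
            # cHRM and gAMA rank equally at the bottom; report
            # whichever of the pair the file actually carries.
            if name in ("cHRM", "gAMA"):
                return tuple(n for n in ("cHRM", "gAMA") if n in present)
            return name
    return None
```

So for the reported file, which carries both iCCP and cHRM, the iCCP profile should be applied and the cHRM/gAMA values ignored.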
>>22674 I just hide them under tags > manage tag display and search.
>mangadex hit by massive DMCA storm, over 700 issues removed, or about 25% of their catalog Noting for media preservation reasons.
>>22704 >What I am really concerned about is a false positive (it saying they are visual duplicates, despite there being a recolour or watermark or other significant change). So, if you do some duplicate filtering work this week, please keep an eye on this line. If it predicts something wrong, I would be interested in being sent that pair so I can test more on my end. Not sure if these are different enough to be what you're looking for here, but these two images show up as visual duplicates (nsfw): [spoiler] https://rule34.xxx/index.php?page=post&s=view&id=13088335 https://rule34.xxx/index.php?page=post&s=view&id=13339046 [/spoiler] The difference is that the first one still had the sketch layer enabled. You can see it if you look at the bottom right of the image.
Is there a faster way to figure out what % of files with X tag have Y tag than setting system limit: to [really big number], searching X tag, checking the autocomplete for Y tag, and then doing math?
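One way to get the two counts without the system:limit dance is the Client API, if you have it enabled. A sketch using GET /get_files/search_files from the Client API docs; the port, access key, and tag names below are placeholders you'd swap for your own:

```python
import json
import urllib.parse
import urllib.request

API = "http://127.0.0.1:45869"   # default Client API port; adjust to yours
KEY = "YOUR_ACCESS_KEY_HERE"     # placeholder access key with search permission

def count_files(tags):
    """Return how many files match a tag list via /get_files/search_files."""
    query = urllib.parse.urlencode({"tags": json.dumps(tags)})
    req = urllib.request.Request(
        f"{API}/get_files/search_files?{query}",
        headers={"Hydrus-Client-API-Access-Key": KEY},
    )
    with urllib.request.urlopen(req) as resp:
        return len(json.load(resp)["file_ids"])

def percent_with_both(n_x, n_x_and_y):
    """What % of files with tag X also have tag Y, given the two counts."""
    return 100.0 * n_x_and_y / n_x if n_x else 0.0

# usage (needs a running client with the API turned on):
# pct = percent_with_both(count_files(["x tag"]), count_files(["x tag", "y tag"]))
```

Two requests and a division, and it scales to a loop over many Y tags without touching the gui.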
>>22704 Do you want examples of false negatives or just false positives?
Hey hydev, I can't tell if this is fixed by looking at v622's changelog so I'll just report it for v621. I've added some of the other auto-resolution rules and made my own, so I have 3: the jpeg/png comparison, a jxl/png comparison, and the pixel-perfect EXIF/ICC data one. The 'pixel-perfect - EXIF or ICC' rule is currently working automatically through about 13k dupes, but the number is not going down on the overview menu. It is updating in the edit menu and it's definitely doing db work, so it's just a minor display bug. I've tried the regen cached numbers task already.
>>22507 >We've thought about many different ways to effect 'this duplicate group has this metadata', and things like retroactive content merge across duplicate groups, but simply expanding the normal file search to work over duplicate groups is a clever and perhaps great solution. Wow, that's actually brilliant! I was narrowly thinking about known urls, but actually this could be extended to all metadata like tags. That would be really cool, and I think you're right that it could even act as an alternative to "hard" merging entirely if it could be something that a user could have on by default for searching. This would have the benefit that it'll always have up-to-date data from any dupes that might get more metadata from a new url or something like that. It seems like an elegant solution. >I am not sure if I would add it to particular system predicates or if it would be better to have it somehow as a checkbox for the entire search. It may be simpler to perform an entire search over all possible members of duplicate groups and then filter the results to kings, but I'd have to think about it more. If having it be attached to an entire search would allow you to make it faster, then that would probably be the best way to do it. >Can I ask if you have considered merging URLs in your duplicate merge options? I have, and in fact this is what I used to do, but it caused many problems for me. It caused downloaders to break really badly sometimes, and it took me a while to figure out that that was the cause. Also, at one point it caused so much confusion for me when I was doing some maintenance work related to urls that I thought I was having database corruption. So I eventually stopped merging urls, and I don't do it at all anymore. Merging known urls also made using them for the duplicate filter completely worthless.
this is something that's very helpful to me, so not being able to know where a file actually comes from in order to help with duplicate filtering and finding sources is too big a downside for me to have url merging. When it's "virtual" though, like in the case of this idea here, it's the best of both worlds. The actual known-urls stay what they truly are, but you can search as if they're merged in the ordinary case where it's not as important. (I know you're busy with the auto-resolution and duplicate filter stuff currently, but if you decide to try out this "virtual metadata merge" idea sometime in the near future, I'd definitely be willing to beta test it.)
>>22803 Update, even with all the pairs resolved, all tags uploaded to the PTR, and another regen, it still happens. Maybe it has something to do with the files being in the trash?
>>22551 Ah, thanks. I was talking to someone else about what you posted and realised I had read your URLs' details wrong in several ways (many such cases). I thought this was some sort of choice on decode complexity--sorry for talking nonsense, hahaha. Ultimately, if JpegXL is getting active work, which it definitely seems to be, I'm hopeful we can see improvements in render speed one way or another. >>22553 Good idea. We are still thinking about making these render prettier, so I'll make a note to play with this. >>22622 If it passes an integrity check I think you are probably great. That's SQLite saying the files are completely undamaged. Sometimes a crash during a heavy commit can cause the database to become unsynced in a couple of tiny software-level logical ways (let's say the client.master.db thinks a tag exists, but the client.caches.db never got that update), but these are fixable. If you want to read more background info and haven't seen it, the master document for this stuff when it does happen is 'install_dir/db/help my db is broke.txt'. You might want to run chkdsk or crystaldiskinfo as a general disk health check. >>22643 >>22644 >>22645 Let me know if you figure any of this stuff out. Win 11 does this to me all the time for other windows, where I click explorer but it stays on the bottom until I click elsewhere to reset some dumb 'last clicked' z-index variable somewhere, and it drives me nuts. Actually, I think for a while I had some tech that tried to force a 'refocus main gui' on a media viewer close event. I can simply add this back in and wrap it in an options checkbox. I'll try and do this for v623--please play with it and let me know if it fixes your issue. >>22654 Thank you! I agree that this is probably just a busted file, but it is always useful to have these around.
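For reference, the integrity check mentioned above is SQLite's own PRAGMA, and you can run it by hand on each database file with the client shut down. A sketch using the stdlib sqlite3 module; the four file names are the standard hydrus db files:

```python
import sqlite3

# The four client databases in install_dir/db/.
DB_FILES = [
    "client.db",
    "client.caches.db",
    "client.mappings.db",
    "client.master.db",
]

def integrity_ok(path):
    """Run SQLite's own corruption scan; a single 'ok' row means the
    file's pages and indices are undamaged at the SQLite level."""
    con = sqlite3.connect(path)
    try:
        (result,) = con.execute("PRAGMA integrity_check;").fetchone()
        return result == "ok"
    finally:
        con.close()

# usage, with the client closed:
# for name in DB_FILES:
#     print(name, integrity_ok(os.path.join(db_dir, name)))
```

Note this only covers SQLite-level damage; the software-level desyncs hydev describes (one db knowing about a tag the other missed) need the repair steps in 'help my db is broke.txt'.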
What is a Hydrus file repository? Does it work like a private PTR but for actual files? Also, I think there's a bug in the shortcut menu. I wanted to set my rating namespace keybind to also change a numerical content rating I have but never remember to use, but the summary and the actual command when I try to edit it are different, while other commands have their summaries correct. Did the command get moved around? The shift+nothing keybind is from when I took a screenshot, ignore that.
(277.03 KB 632x492 python_boZm8gPtd4.mp4)

>>22668 Thank you, this is interesting. If the profiling mode didn't catch these, then I expect the actual hydrus code in the API network request was working very fast, and so it was assigned to the 'fast job' pile. When I do a file request, I check the request params are all good, find the file, and then I say to twisted, our networking engine, 'hey, here is a file request, please stream this to the user' and hand off control. The twisted mainloop, which is some clever async stackless event loop thing, then handles it. If it is twisted that is getting lagged out here, I wonder if all of these requests are stacking up and consuming some 'max number of streams' resource, and my close handler isn't clearing/resetting that resource properly. I will investigate this. >>22688 Yeah exactly, what I meant is that I still maintain both the pyproject file and the requirements.txts separately in my own personal project workflow. Any time I change a library version, I update all locations. I want to move to a workflow that uses the pyproject as the single 'master' that I edit and then I use some script to auto-generate child requirements.txts as needed from that, but I just haven't got around to it yet. >>22712 >>22719 Thanks, I appreciate you doing the imagemagick thing. I figured my math was off in one or two pixels, but perhaps it really is pixel-perfect and I'm just generating the pixel hash without the gamma correction somehow. I will reinvestigate this and try to nail it down. I also want this data highlighted in UI somehow, like ICC Profiles. I saw the same list of 'this is the order to apply png corrections', and I tried to make the gamma stuff secondary to ICC Profile, but I guess this is failing somehow. Or maybe that file's ICC Profile is weird. I'll revisit it. I'm open to scanning for and potentially writing a fix for sRGB data too in the same way, once we're happy with this step and assuming it isn't one-in-a-million. >>22757 Thank you for this report! 
Unfortunately I do not get the same--mine says 'not visual duplicates'. Vid related. Are you sure the URLs/files are all lined up with your example? Hashes I got from 'original image' downloads from those Post URLs are: 6ebeba75f560943e9b7bf2715f212ad9c3fb2c2765da60ba638d0139c7f24b12 d1398e3cd92604fc6f4c5dc4eb877f1a8d67c508f735755eced209dc3df4a7ee
>>22782 I think that's probably the most simple way to do it, if you have only like five to look up. If you want to compute stats for hundreds of tags, you'd want to access the database directly. This is totally doable, but I'd have to walk you through what tables to go over. If you are familiar with SQLite and want to poke around yourself, you'd be looking at a 'display' mappings table in client.caches.db cross referenced to subtags/namespaces/tags in client.master.db. Let me know if you'd like a full walkthrough. >>22796 I've had a bunch of false negative submissions now, and they are all generally in the genre of 'high & low quality jpeg encode' or 'blurry small resize vs original image'. I don't need any more like that, and I'm also ok with those being a false negative. I can't make a good solution in this greyzone, so I'll probably update my texts to have finer graduations of 'it almost passed the test, but probably not visual duplicates'. I'd love any false positives. Haven't had any but >>22757, and I could not reproduce it. >>22803 >>22832 Damn, thank you. I've seen something similar to this recently and couldn't reproduce it on my dev machine. I guess this list still isn't updating properly sometimes. I'll give it another look. >>22841 Yeah, a file repo is basically just a server that will hold files and give thumbnails and some file info to a syncing client. You can run a search on a file repo you are synced to just like you can on 'my files' or 'trash', with tag autocomplete and everything, and then you basically double-click a thumbnail and it downloads. It kind of works, but hardly anyone uses it and I never put much time into it. It has a buggy workflow and awful admin panels, and I'll probably retire it sometime, particularly when we figure out nicer client-to-client API comms. 
If you are not an experienced hydrus user, I'll recommend you stay far away, but if you know what you are doing and are comfortable trying out the server (or already do with a tag repo), feel free to play around with it. For your rating shortcut bug, thank you for this report. The shortcut system doesn't support one-action->multiple-commands yet, which I should highlight in UI better. As to why your panel is initialising with the (default) archive file command and failing to initialise with the actual command it launches with, maybe that's a related bug, maybe not; I'll have a look at it.
>>22829 Thanks for your perspective. Yeah, I think this is worth playing around with, and there's probably some neat bit of logic I can add to the core file search that just says 'expand search domain to all grouped files' and then 'collapse all results to kings' or something, either wrapping the whole search or every sub-predicate I test. I'll make a note and have a think about it all when auto-resolution is clear. These questions are only going to come up more with auto-dupes happening.
(18.36 KB 344x59 17-20:28:18.png)

>>22844 If it helps, I added another auto-resolution rule for pixel dupes without ICC/EXIF info, and the only thing that shows up as completed is the one test pair I did manually. They were all semi-auto at the start, but I switched them to fully automatic; maybe it has something to do with switching the operation type while they work? Alternatively it may be something with the 'work hard' button: I want the work done quickly so I've always had it on, and maybe a 'work hard' on a prior task that finishes after a newer task gets added locks it up? I'm just spitballing as a dabbling programmer who can't read Python. I just tried restarting and it did update the progress indicators properly.
>>22844 I've got false negatives that don't fall into those categories, actually! They're same resolution, same file type, no jpg quality difference if they're jpgs, and completely indistinguishable to my eyes even if I zoom in. https://files.catbox.moe/fbukpb.7z I'll keep my eyes out for false positives!
(532.07 KB 2246x1640 lewdstuff.jpg)

(2.15 MB 2113x1543 lewdstuff.png)

>>22842 >Thank you for this report! Unfortunately I do not get the same--mine says 'not visual duplicates'. Vid related. Are you sure the URLs/files are all lined up with your example? Ah, my bad. Some url metadata got mixed together due to the source url on different boorus being the same despite the files being different. The correct urls are: https://e621.net/posts/5490251 https://rule34.xxx/index.php?page=post&s=view&id=13339046 Hashes: sha256:d1398e3cd92604fc6f4c5dc4eb877f1a8d67c508f735755eced209dc3df4a7ee sha256:dd9f2ad2e45f9395b570e425038353ae7e36f5516bd985f51918b768e204d5c7 and I'll attach them as well (assuming 8chan doesn't do any file optimization).
(6.62 KB 573x135 future!.PNG)

I think when hydrus tries to determine a web domain time from a direct file link, it doesn't take into account time zones? I don't know how else this could happen, where the web domain time is in the future. There was no parser involved here since it's a direct file link.
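For what it's worth, a timezone-naive parse is exactly how a 'future' timestamp appears: if a server east of UTC writes its local clock time into a header with no offset, and the parser assumes UTC as a fallback, the stamp lands ahead of real UTC 'now'. A toy illustration with made-up dates and offsets (the function name and format string are mine):

```python
from datetime import datetime, timezone

def parse_naively_as_utc(datestring):
    """Parse a tz-less server datestring and assume UTC, roughly the
    kind of fallback a parser has to make when no offset is given."""
    return datetime.strptime(datestring, "%a, %d %b %Y %H:%M:%S").replace(
        tzinfo=timezone.utc
    )

# Hypothetical: a server at UTC+5 stamps its local 19:00 with no offset.
stamp = parse_naively_as_utc("Wed, 14 May 2025 19:00:00")
# Real UTC at that moment was 14:00...
now_utc = datetime(2025, 5, 14, 14, 0, tzinfo=timezone.utc)
# ...so the parsed 'web domain time' appears five hours in the future.
assert stamp > now_utc
```

The same mechanism works in reverse for servers west of UTC, which is why the error is bounded to roughly +/- a day's worth of offset.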
>>22869 Accidental spoiler...
(6.76 KB 512x155 ponerpics.png)

I tried to make another downloader.
I'm guessing it's not a good idea to use the same database on both linux and windows, right? I'm dual booting linux and ended up using it more than windows, but my hydrus install is still on windows, so I can't access my DB...
>>22879 it should be completely fine. I used to run the Windows version of Hydrus in Wine, but now I run it natively on Linux. I never needed to change anything about the db.
>>22882 You are right, it worked, just had a few issues with mounting my drives properly... but now it works.
>>22868 Thanks, perfect. Very interesting and slightly difficult problem of a 'soft' alternate spread over a larger region of the image. It evaded my basic skewness test since the difference wasn't localised, and the difference is subtle enough my normal stats didn't see it. I played around a bunch with this today, trying a load of bullshit distribution modality testing and stuff, but there isn't an excellent solution with my current tech. Ultimately I think my tile-comparison scorer needs to be smarter, and I have some ideas for this that I can play around with in future iterations of the system. I've written a smarter hook to catch this particular guy because there is one spot in its distribution that sticks out, but I don't think it will work for other pairs in this class. To better handle grey cases, I've categorised my positive results here into 'very probably', 'definitely', and 'perfectly' confidence intervals. I'll allow the user to choose between these in auto-resolution and will default to definite/perfect only and not go anywhere near the grey area for now. One step at a time from pixel-perfect, and keep iterating, is fine. Any more false positives you run into would be great! >>22862 Great, thank you. I have tuned my skewness test to recognise these better. They are now recognised as 'definitely visual duplicates'. >>22846 Thanks, fixed it today. Auto-resolution rules have some funny properties and the list still wasn't updating them correctly. I pinned down the problem and it should be sorted--let me know if you have any more trouble.
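For the curious, here is a toy sketch of the tile-comparison idea being discussed — emphatically not hydrus's actual algorithm, just my own illustration: score each tile by mean absolute difference, then flag the 'skew' where a few tiles stand far above the average, which is what a sketch layer or watermark tends to produce, while a uniform re-encode lifts all tiles roughly equally:

```python
def tile_scores(a, b, tile=2):
    """Mean absolute difference per tile, for two equal-size greyscale
    images given as 2D lists of pixel values."""
    h, w = len(a), len(a[0])
    scores = []
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            diffs = [
                abs(a[j][i] - b[j][i])
                for j in range(y, min(y + tile, h))
                for i in range(x, min(x + tile, w))
            ]
            scores.append(sum(diffs) / len(diffs))
    return scores

def looks_like_local_edit(scores, factor=3.0):
    """Flag when some tile differs far more than the average tile:
    a localised change rather than a uniform re-encode."""
    mean = sum(scores) / len(scores)
    return mean > 0 and max(scores) > factor * mean
```

The 'soft alternate spread over a large region' case above is precisely where this outlier-style test fails: every tile shifts a little, so nothing sticks out.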
(63.19 KB 1147x702 Capture1.PNG)

(9.75 KB 640x166 Capture2.PNG)

(6.45 KB 635x118 Capture3.PNG)

v622, running from source, python 3.11, windows 10. I noticed that certain images I downloaded from a set were taking a really long time to load in the media viewer. I managed to capture them with profiling mode, and it appears to be related to gamma/chromaticity. (first pic related) Then I realized that when these images were taking so long to load, they were also using tons of CPU and memory. (second pic related) Also, they sometimes cause hydrus to just completely close itself silently. Nothing in the log. It's inconsistent whether it causes a crash. If it doesn't cause a crash, then sometimes even after closing the image and exiting hydrus, python will continue to exist in task manager, using tons of CPU and memory. (third pic related) This doesn't always happen, though. Here's a few of the images that this happens for. Not safe for work. https://files.catbox.moe/6rqrl0.png https://files.catbox.moe/qy8mdv.png https://files.catbox.moe/6c81by.png
I found a false positive: https://danbooru.donmai.us/posts/9239441 and its child are reported as "visual duplicates" despite the mosaic censoring on the parent. Probably related to the child being crunched to hell by bluesky. Aside from that, the only issue I've noticed is the info occasionally gets cut off at the bottom of the duplicates processor's right panel.
>>22869 Thank you for this report--yeah, in a couple places in the parser and the network engine, I often don't try to pull the timezone. They are so often out of whack, where either the server tries to give you a datestring in UTC, or the timezone you reported, or the server timezone, or the web engine's 'set' timezone, or the timezone data is in a weird format, that I mostly just fall back to trying to get the simple answer and assuming some sort of UTC. There may be a nicer solution to this with the neat dateparser library we use these days, and I'll schedule a job to check that out, but in general I don't think there is a good and reliable solution here. Ultimately I'm not too worried since after a year, that +/- 12 hour imprecision isn't a big deal. >>22901 Thank you, damn, I will check these out. That's my new 'gamma' correction code, I guess it falls over on larger files. I'll see what I can do. >>22904 Thank you, very useful! Thankfully the hook I wrote for the pair above also applies to these (basically the mosaic stands out as a subtle skew bump that I can sometimes detect), so this pair now reports as 'probably not visual duplicates' for v623. I'll hang on to these, and when I revisit my basic scoring system that undergirds this algorithm, I'll see if I can figure out a spike for 'yes, that is a mosaic, not jpeg artifacts'.
>>22441 >>22507 I've done this for a long time, but mpv as a comic reader/image viewer is just missing too many features. It works, but the page flipping, zooming, scrolling, are not comfortable. Long-strip comics are a real pain. Aren't there other ereader programs you can subsume, hook into, or hack? Zathura?
>>22712 Follow-up here: I tested the jxl and png in my 'A and B are visual duplicates' detector, and there were very minor differences in some of the tiles. It was extremely similar, but my guess is my hacky gamma/chromaticity solution has some float rounding errors, so it isn't perfectly pixel exact compared to the JXL and its ICC Profile output. It may be possible to correct my math, but I had a look and a tweak here and there and couldn't figure it out. I also noticed that when I loaded the image in Qt, the ICC Profile it generated, when applied, also seemed to generate a different pixel hash to the JXL. So, it may be that with ICC Profiles and white points and all that stuff we can't expect pixel hashes to always collide. I'm not sure. For the new image, I'm afraid I get the same colour picker values whether I load the file in hydrus, qView, or GIMP 3.0. However this file has a lot of underlying bumps as far as I can see--the 'flat' areas actually aren't. The skin in the highlighted area is generally fcf2ea in hydrus (top) and qView (right). It isn't so reliable in GIMP, which is not applying I guess a smoothing upscaling filter for the zoom (see vid for relatively bumpy selection), but it is there in places. Also, the good(?) news is this file is not being run through my gamma/chromaticity stuff. My loader is recognising it has an ICC Profile and just applying that, as it is supposed to do. Maybe at a strict 100% zoom and choosing the exact same pixel we'd see a difference, but I can't see obviously that hydrus is shifting the colours in one direction--can you say more about what you saw with GIMP colour-picking? Could it just have been accidental, since GIMP is picking from the secretly bumpy colours in the 'flat' areas?
(21.17 KB 607x423 2025-05-14_22-58.png)

>>22928 What I noticed is the dark areas of the dress are significantly darker than they should be. In gimp (ver. 3.0.4) I get #1C1C23 from a screenshot of hydrus and #222228 from the original PNG, picked just to the left of the top crease line in the attached screenshot from hydrus. Sorry about the lack of details in my initial report. here's a JXL for reference: https://files.catbox.moe/50kvmk.jxl
I had an ok week. I fixed and cleaned some things, improved some recent png memory spikes, and polished and added the 'A and B are visual duplicates' test to manual duplicate filtering for all users. The release should be as normal tomorrow.
>>22936 I've been watching the visual duplicates line and have yet to find any false positives myself, but if this means you're adding an option to filter specifically on files marked as visual duplicates, like you can with pixel-for-pixel duplicates, that'll make looking for false positives easier. Glad to see it!
https://www.youtube.com/watch?v=aigNM1MR_x0
windows zip: https://github.com/hydrusnetwork/hydrus/releases/download/v623/Hydrus.Network.623.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v623/Hydrus.Network.623.-.Windows.-.Installer.exe
macOS app: https://github.com/hydrusnetwork/hydrus/releases/download/v623/Hydrus.Network.623.-.macOS.-.App.zip
linux tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v623/Hydrus.Network.623.-.Linux.-.Executable.tar.zst

I had an ok week. I fixed some bugs, and the 'A and B are visual duplicates' test is available for everyone in the duplicate filter.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

highlights

A user has submitted a significant new darkmode stylesheet, 'Paper_Dark', which you can find under options->styles. It has assets and will change the appearance of common widgets like checkboxes.

I recently improved how some PNGs render their colours. The process I wrote was eating a lot of memory on large files, which I fix today. It still runs a little slow, but I'm close to figuring out a nicer solution.

The 'A and B are visual duplicates' test last week went well. Thank you to those who found and submitted pairs it was failing on. I have improved the algorithm's precision and stated confidence, and it is now turned on for all users. In the duplicate filter, you will now see statements like 'definitely visual duplicates'. You can generally trust them, but if you find a pair that is completely different from what it predicts, particularly if it is highly confident they are visual duplicates when they are not, I'd love to see it!

future build

Only for advanced users! I am making another future build this week. This is a special build with new libraries that I would like advanced users (and this week particularly Win 10 users) to test out so I know they are safe to fold into the normal release.
More info in the post here: https://github.com/hydrusnetwork/hydrus/releases/tag/v623-future-01

next week

I would like to get 'A and B are visual duplicates' working in the duplicates auto-resolution system. This test needs extra CPU, so I'm going to have to adjust some preview UI to handle the per-pair delay better. I feel generally good about it, and I hope we'll be able to auto-resolve a lot of non-pixel-perfect dupes very soon.

Just as a general note, I'm taking my summer vacation week on June 4th, so there are two more releases before then.

>>22955 Unfortunately not, just in the live comparison statements for each pair. The data is expensive to calculate, and caching it in the database would be trickier than something like pixel-perfect similarity. My hope is we'll integrate it into the auto-resolution system and all the easy examples will be automatically removed from the queue, making the duplicate filter naturally populated with more interesting and varied results, and that'll be useful on its own. We'll see how it all goes.
(1.46 MB 1076x1076 58aebad99a.png)

>>22963 >video
>>22963 >The data is expensive to calculate and caching it in the database would be trickier than something like pixel-perfect similarity
Oh okay, that's a shame. Well, in that case, would you mind adding a "filter" for the semi-automatic mode of auto-resolution? It'd be something where you look through each pair of files for the rule, then press a key for "approve" or "deny", like with "archive" and "delete" in the archive/delete filter, and it just takes you to the next one down the list. Unless I'm missing something, it looks like you currently have to click on each pair, inspect it in the window that pops up, then exit that window and click on the approve or deny button. A filter would make this process much faster, so I could focus more on inspecting the pairs to make sure the rule is good.
>>22976 >>22963 Actually, now that I give it another look, I just noticed that these pair pages don't have the comparison statement boxes either. Those would also be useful for semi-automatic mode, since they would make it easier to quickly spot if I made a mistake when creating the rule, though that's probably less important than having a filter.
I just got myself a laptop for traveling and stuff, and I'm wondering how to sync my two hydrus clients. Right now I'm copying my backup from my main pc to install it on my laptop, but it takes around 1:30 to copy all that stuff. Surely there's an easier way to do this, right? Maybe I could leave my main pc running as a server and use my laptop to access files? Although sometimes I would be saving new stuff or changing tags, so I'm not sure how I'd handle that.
>>22983 Look at hydrus-web, it can upload files and change tags from your web browser. You can use a vpn to remotely access your PC.
>>22984 >>22983 I should mention, you can use it through hydrus.app. I personally run my own instance through docker, but it should be no different.
>>22984 >>22986 I don't quite understand how hydrus-web works yet, but I found the github, so I'll go read it. It seems like it'll work fine for me, so I'll learn that one. Thanks!
>>22989 (me) >>22986 >>22984 Oh and by the way I'm using a VPN already to pretend I'm on LAN for other stuff so I can probably set up a safe hydrus environment like that. Yay!
How do I filter out files that have both tags A and B, but not files with just either? Sorry, am smoothbrain
>>22991 Click "OR", add both tags as excluded.
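That works because of De Morgan's law: "lacks A OR lacks B" is exactly "not (has A and has B)". A toy python sketch with made-up tag sets, just to illustrate the logic:

```python
# Toy model: each file is a set of tags. Excluding A OR excluding B keeps a
# file when it lacks 'a' or lacks 'b' -- by De Morgan, that removes exactly
# the files that have BOTH tags, which is what was asked for.
files = [{'a'}, {'b'}, {'a', 'b'}, {'a', 'c'}, set()]

kept = [f for f in files if ('a' not in f) or ('b' not in f)]

assert {'a', 'b'} not in kept  # the only file with both tags is filtered out
assert all(f in kept for f in files if f != {'a', 'b'})
```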
Hydrus can't render this: https://e621.net/posts/5589538

DamagedOrUnusualFileException: Unable to render that video! Please send it to hydrus dev so he can look at it!

================ Traceback ================

Traceback (most recent call last):
  File "/[...]/hydrus-622/hydrus/client/ClientRendering.py", line 1025, in THREADRender
    numpy_image = renderer.read_frame()
                  ^^^^^^^^^^^^^^^^^^^^^
  File "/[...]/hydrus-622/hydrus/core/files/HydrusVideoHandling.py", line 1019, in read_frame
    raise HydrusExceptions.DamagedOrUnusualFileException( 'Unable to render that video! Please send it to hydrus dev so he can look at it!' )
hydrus.core.HydrusExceptions.DamagedOrUnusualFileException: Unable to render that video! Please send it to hydrus dev so he can look at it!
>>22993 The file has a .png extension but Gwenview image viewer says it is an .apng file.
(167.13 KB 2067x1602 1464294992699.jpg)

(167.13 KB 2067x1602 1469390057974.jpg)

(125.03 KB 1639x922 1433591667672616967_1.jpg)


(12.97 KB 750x528 123, 1231, d.png)

I've run all the default auto-resolution rules to completion, but still have about 10k pixel dupes left over. From a cursory glance, many are the exact same size, or they have EXIF/ICC data on both (usually one has EXIF and ICC whereas the other has only one of the two). For the same-size files, I think it's because they're exactly the same size in bytes, so they don't actually pass the A<B rule; or they both have some EXIF/ICC data, so neither ends up as the king. I think <= or >= would fix the A<B aspect, but looking around... did greater-than-or-equal/less-than-or-equal get removed? I don't see it in the system predicates anymore. Picrel are some example pairs. No, I have no clue how I have so many pixel dupes that are the same size but not the same hash.
>>23003 Oh, the image limit is 5, I thought it was 6.
Hydev, can you please elaborate on this statement from your changelog? >the 'this is a pixel-for-pixel duplicate png!' comparison statement and associated high score no longer applies to webp/png pairs. webp can be lossless, where this decision is less clear. if and when we get a nice 'this webp is lossy/lossless' detector, I'll bring this back Why wouldn't a pixel duplicate webp be better than the png? I'm wondering if I should wait on setting up an automatic webp-png pixel dupe process or not.
>>23005 Depends on what you value. I personally set up a rule to just always take the smallest file on pixel-for-pixel dupes, regardless of anything else, since that's how I was handling them manually.

The reason the 'pixel dupe of jpg' rule exists is because, logically, the jpg has to be the original in those cases (or at least one conversion step older), since if it were the other way around they wouldn't be pixel dupes. (This also likely saves space, since jpgs are smaller than pngs in nearly all cases.)

Since a webp can be lossless, there's no indication of which file is closer to being the original.
>>22842 >I want to move to a workflow that uses the pyproject as the single 'master' that I edit and then I use some script to auto-generate child requirements.txts as needed Ah, sorry, I misunderstood! If you'd like, uv has an option for that: uv pip compile. Keep in mind that it takes some liberties from what it resolves (for example, beautifulsoup4>=4.0.0 becomes beautifulsoup4==4.13.4 because that's what it will end up using), but it's a pretty complete solution. For reference, `uv pip compile pyproject.toml --universal --no-deps --no-annotate --no-header -o requirements-uv.txt` will spit you a file similar to `requirements.txt`, with the `sys_platform`-specific stuff (that's `--universal`) `uv pip compile --universal --no-deps --no-annotate --no-header --group qt6-new-pyqt6 -o static/requirements/advanced/requirements_qt6_new_pyqt6-uv.txt` will just get the contents of that file from the qt6-new-pyqt6 group that's in pyproject.toml. If you remove the `-o` option, it'll show you on stdout, easy to test like that. You could combine groups, like `uv pip compile --universal --no-deps --no-annotate --no-header pyproject.toml --group qt6-new-pyqt6 --group mpv-new --group opencv-new --group other-normal`, but that's kind of advanced, and doesn't fit the current setup you have with the `static/requirements` files you have. Bottom line: you could loop the groups + the main file and it'll reproduce (more or less) the files you currently have. I hope you try it!
>>22932 Aha, can you please check the 'eye' icon in your media viewer and ensure that 'apply ICC Profile colour adjustments' is on? Maybe in some testing here, that got flipped off for you? In my testing, when I have it on, the ICC Profile is correctly applied and I get #222228, but when it is off I get #1C1C23.

Pic related is, starting top-right and going clockwise:
JXL Hydrus
PNG Hydrus (ICC on)
qView
GIMP
Cut-out of PNG Hydrus (ICC off)

They all say #222228 except the ICC-off one, so that's what I guess you are seeing. If it is this ICC Profile setting, that brings up the question of what the setting should be doing. It is basically a debug holdover from when I was originally implementing this tech, but we probably want it always on and only temporarily disable-able in the viewer, rather than a global setting that will change pixel hashes and thumbnails and so on.

>>22976 >>22978 Thanks, yeah, I agree. I think we are going to figure out some way of loading a semi-automatic rule in the dupe filter proper. As you suggest, it makes sense to natively integrate it into the current pending/approve thumb pair list.

>>22993 >>22997 Thank you! Yep, looks like a busted apng. It is an unusual format, and web browsers tend to have better support for weird flags or truncation than ffmpeg or mpv do. I'll hang on to this as a useful example; maybe future versions of ffmpeg/mpv will suddenly handle it ok.

>>23003 >>23004 Thanks, I'll add >= and <= to the main NumberTest thing that backs all this. Your Mercy files have the same hash for me, so I am guessing 8chan stripped out some metadata, but the robot pictures look exactly the same except for two bytes near the top, in an ICC Profile: one says the ICC Profile is v4.3, the other that it is v2.1. I've been doing ICC Profile header stuff recently, so I actually know this, ha ha.
I guess some program or CDN sucked this file in, 'optimised' it, and updated the ICC Profile version bytes to what the profile actually is, or to whatever newer value it would prefer. The png and gif are completely different internally and just slightly different sizes.
>>23005 Sure: as per >>22482 , the logic of 'for normal images, a pixel-perfect png is always a copy of the original lossy (jpeg)' only extends to webps if they are also lossy. I had forgotten that webps have their own lossless encoding mode. If a lossless webp and a (lossless) png are pixel-perfect duplicates, we don't know which is the original. Hydrus isn't clever enough to tell a lossy webp from a lossless one, so it isn't appropriate to apply the confident 'this is a pixel-perfect png duplicate that is a waste of space' statements to webp/png pixel-perfect pairs at the moment. If and when I do get the ability to know whether a webp is lossless, I'll be able to make that statement for lossy webps but not lossless ones. It isn't a big deal, though! If we are talking pixel-perfect dupes, no visual data is being lost either way. If you want to save space, go ahead and set up your own auto-resolution rule; only refrain if you care specifically about detailed metadata preservation. >>23017 Thanks!
(304.69 KB 1133x724 Screenshot_20250524_233632.jpg)

>>23018 > Yep, looks like a busted apng Indeed. Even Gimp cannot convert it to an animated GIF.
When I'm running a downloader and it encounters a deviantart post, it locks up the whole downloader because the "auth" cookie is missing, and it seemingly just waits there forever until I manually cancel it. Is there some way to have it just "fail" when something like this happens instead of waiting forever?
>>23018 >Aha, can you please check the 'eye' icon in your media viewer and ensure that 'apply ICC Profile colour adjustments' is on? Ah! that did it. Turns out the problem was on my end. Sorry about that.
>>>/v/1399962 >You can't view a anything on atfbooru without an account anymore from now on. Will importing with Hydrus still work? Is there a way to pass the login info to Hydrus?
>>23028 I thought importing and downloaders had been broken since they added anti-crawler/DDoS measures early this year. Late last year?
>>23029 Yeah, I think I remember reading that now. I've never downloaded from ATF's booru, so I forgot that happened.
>>23019 I see, thanks!
Does anyone know how to download rule34.xxx favorites?
I had a great week. I fixed a major lag problem in duplicates auto-resolution, re-made a 'search page locked' mode that makes it easier to manage 'static' file pages, and we are folding in some important library version updates to the main build. The release should be as normal tomorrow.
I found a false-positive pair for "definitely visual duplicates" where the files are actually alternates (one is a revision of the other). This is the first one I've encountered so far, or at least the first I noticed. Also, I'm on v623. I tried to put them in an archive file so 8chan won't alter them, but it doesn't allow those. I'm not sure if you'll be able to reproduce it if the server does alter them, but here they are anyway. (They're nsfw.)
>>23128 Literally right after (2 pairs later), I encountered another false positive of "definitely visual duplicates". It was art from the same artist (though this time it's full-color) and it was also a revision. I don't wanna keep posting them if I encounter more, since that might be annoying, but let me know if you want that pair too.
>>23129 >>23128 Yeah, I just encountered a third. Again same artist and again a revision, but this time the revision is a spelling correction in the text instead of an adjustment to the characters in the art. Anyway, I'll stop reporting these for now, since clearly something needs adjusting in whatever algorithm Hydrus is using here.
>>23128 >>23129 >>23130 Thank you, this a great example. As it happens, I already tweaked the algorithm to detect this situation, and in the release I will put out very soon this pair now counts as 'probably not visual duplicates/(small difference?)'. Please hold on to your examples and check again in v624 and send me any that still fire false positive.
>>23131 okay glad to see you're fixing it up so much! you're doing great work!
https://www.youtube.com/watch?v=c40xH0780AA
windows zip: https://github.com/hydrusnetwork/hydrus/releases/download/v624/Hydrus.Network.624.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v624/Hydrus.Network.624.-.Windows.-.Installer.exe
macOS app: https://github.com/hydrusnetwork/hydrus/releases/download/v624/Hydrus.Network.624.-.macOS.-.App.zip
linux tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v624/Hydrus.Network.624.-.Linux.-.Executable.tar.zst

I had a great week. Duplicates auto-resolution is easier to preview, and I have re-introduced a way to 'lock' a search page in place. This release updates several important libraries on Windows and Linux.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

new build

The Windows/Linux 'future build' test last week went well, with no reports of problems. It seems like Win 10 will still run the program, although I suspect a very old (i.e. un-updated) version may have trouble. If you cannot boot today's build, please consider running from source and choosing an older version of Qt in the interactive setup: https://hydrusnetwork.github.io/hydrus/running_from_source.html

If you use the zip or tar.zst, it does not seem like you have to do a 'clean install', but the builds are changing a bunch of stuff, so it is a good time to do one anyway: https://hydrusnetwork.github.io/hydrus/getting_started_installing.html#clean_installs

If you run from source, this is a good week to rebuild your venv. Users on python 3.13 no longer have to choose the (a)dvanced install.

locked pages

A long time ago, when you opened files 'in a new page' or from a subscription files popup, the page created would have no search controls. It would just be a static page that could hold files. It worked well as a 'scratchpad' to work on, but you could not easily search the files if you wanted to.
I replaced that mode with 'initialise the page with a system:hash predicate', which helped simplify things behind the scenes but made it annoying to append new files or merge other pages in, since the underlying system:hash stays stuck as what it was originally.

Today we fix this. All search pages now have a 'lock' icon button beside the tag autocomplete text input. Click it, and the current search collapses to a system:hash for the current files in view, and the search interface is replaced with an unlock button. It keeps track of when you add or remove files, and if you unlock, the system:hash reflects what is currently in view. Have a play with it, and you'll see how it works.

Whenever a new page is created with files--which usually means 'open in a new page' or a subscription file popup--it now starts in the locked state. The old 'no search enabled' behaviour is back, but if you want, you can flip to a regular search with one click. Let me know how it goes!

auto-resolution preview

The duplicates auto-resolution 'preview' panel, when you were editing rules, was running way too slow. It could take thirty seconds to load a count or a preview on a big client, every time you made a change. I've overhauled the whole thing to stream results in fast pieces, with pause buttons, faster cancel tech, and better feedback. This system also now handles when the pair comparison takes a while to compute. It is pretty much ready for 'A and B are visual duplicates'.

I fixed a couple more false positives in 'A and B are visual duplicates'. There's one edge-detection situation that I poured a ton of time into and still failed to catch, so I'm going to keep thinking about it. Please send in any more weird pairs you come across!

Auto-resolution rules in semi-automatic mode will now only queue up 512 items for 'ready to action'. This queue can take a while to build and resets any time you change the rules, so I'm limiting it to keep things snappy.
You can change the limit or remove it entirely in the edit rule panel.

misc

Ratings look nicer and line up great again!

A bunch of number-tests across the program have new 'less than or equal to' and 'greater than or equal to' operators.

next week
I only have one week before my summer vacation week, so I'm just going to fix little stuff and clean some code.
>>23131 I updated to v624 and both of the other two pairs I found are now caught as "probably not visual duplicates (small difference?)", so that's good! However, I did just now encounter another, and this one isn't caught by v624. It's by a different artist this time, but it's yet again a revision pair that's being seen as "definitely visual duplicates" by the duplicate filter. I'm guessing the thing that's tricking the duplicate filter here is that one is a revision of the other, but when the artist revised it, he also decided to raise the resolution a little for some reason. So they're alternates, but if you zoom into some part that's not where he made the adjustment (it's her face, by the way), they look like duplicates from there. That makes me think Hydrus is doing some kind of "check if these pixels are identical, and if they are but others aren't, they're not visual duplicates" test, and it's failing here because of the resolution difference. Anyway, I attached them for you to check out. (They're kinda nsfw.)
>>23134 Thank you, this is perfect. I'll see what I can do. My algorithm basically cuts the image into 16x16 tiles and then does some colourspace histogram profiling and comparison on each tile pair. I do some stats afterwards, looking at the mean, max, variance, and skew of the tile-pair difference-scores, and many common alternates or recolours or whatever pop out in one way or another. Unfortunately, when the alternate change is only subtle in shading difference, the difference gets lost in the statistical noise. I'm pursuing an edge-detector solution now, to catch clustered visual differences, but I had trouble tuning it last week.
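For anyone curious, this is not hydrus's actual code, but a rough numpy sketch of the general shape of what's described above: cut both images into a 16x16 grid of tiles, histogram each tile per colour channel, score each tile pair by histogram distance, then summarise the scores with mean/max/variance/skew. The function names, the L1 histogram distance, and the 32-bin choice are all my own guesses:

```python
import numpy as np

def tile_difference_scores(im_a: np.ndarray, im_b: np.ndarray,
                           grid: int = 16, bins: int = 32) -> np.ndarray:
    """Cut both images into a grid x grid layout of tiles and return one
    difference score per tile pair (L1 distance between normalised per-channel
    histograms). Inputs are same-shape (H, W, 3) uint8 arrays."""
    h, w = im_a.shape[:2]
    scores = []
    for ty in range(grid):
        for tx in range(grid):
            ys = slice(ty * h // grid, (ty + 1) * h // grid)
            xs = slice(tx * w // grid, (tx + 1) * w // grid)
            score = 0.0
            for c in range(3):
                ha, _ = np.histogram(im_a[ys, xs, c], bins=bins, range=(0, 255), density=True)
                hb, _ = np.histogram(im_b[ys, xs, c], bins=bins, range=(0, 255), density=True)
                score += float(np.abs(ha - hb).sum())
            scores.append(score)
    return np.array(scores)

def score_stats(scores: np.ndarray) -> dict:
    """The mean/max/variance/skew summary mentioned above. An alternate with a
    localised edit tends to show up as a high max or long tail (skew) over
    mostly-zero tiles, while uniform encoding noise raises the mean instead."""
    mean, var = scores.mean(), scores.var()
    skew = float(((scores - mean) ** 3).mean() / (var ** 1.5 + 1e-12))
    return {'mean': float(mean), 'max': float(scores.max()),
            'variance': float(var), 'skew': skew}
```

As the post says, a subtle shading change in one tile barely moves these summary numbers, which is exactly why clustered edits get lost in the statistical noise.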
Here's a batch of genuine false negatives. https://files.catbox.moe/64zvlg.7z It looks like it struggles with featureless backgrounds, especially transparent ones? I actually have a fuckload more pairs of the same variety as the transparent ones in here. Considering what you said in the upload notes, I wonder if this is the same issue as overly simple images being unworkable.
Is there any way for me to add uppercase to the `filename:` tags? Sometimes I'll have a soundpost or something that needs uppercase in the filename for whatever reason, but `filename:` turns everything into lowercase. Any way to bypass this? I could use notes, but it's kinda clunky to use them as a naming substitute.
(517.78 KB 329x348 3e7e.gif)

>>23143 >Is there any way for me to add uppercase to the `filename:` tags?
>>23143 Tags are always normalized to lowercase. When you think about the purpose of most tags (to describe aspects of the file so that you can search for them), this makes sense: case isn't strictly necessary in written language (after all, it doesn't exist in spoken language), and lowercasing automatically merges many duplicate tags that would otherwise exist. Filename tags, title tags, and other tags like that are a bit different, because instead of describing an aspect of the file, they essentially act like a "property" or "field" of the file, a kind of external metadata. For these cases, where the text should be preserved as-is with no modifications, I would say that notes are more suitable, but since afaik you can't search the contents of notes currently, they wouldn't be as useful. Ideally that'll be addressed in the future and you'll be able to search notes. Why exactly do you want to preserve uppercase, though? If it's for acronyms and initialisms, my current workaround is to replace every uppercase letter in the acronym with `{lowercase equivalent}.`, so instead of something like `IIRC` it'd be `i.i.r.c.`, which looks a bit ugly but communicates the idea.
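That workaround is easy to script if you're tagging in bulk. A small sketch; the function name and the "runs of 2+ capitals count as an acronym" rule are my own assumptions about what you'd want:

```python
import re

def dot_acronyms(tag: str) -> str:
    """Rewrite runs of 2+ uppercase letters as dotted lowercase, so the
    acronym survives hydrus's lowercase normalisation: IIRC -> i.i.r.c."""
    return re.sub(
        r'[A-Z]{2,}',
        lambda m: ''.join(c.lower() + '.' for c in m.group(0)),
        tag,
    )

assert dot_acronyms('IIRC') == 'i.i.r.c.'
assert dot_acronyms('an NTSC rip') == 'an n.t.s.c. rip'
```

Single capitals (e.g. names) are left alone, so only genuine acronyms/initialisms get the dotted treatment.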
>>23158 >why exactly do you want to preserve uppercase though?
NTA, but "soundpost" is a keyword for a 4chan extension that loads (catbox) links from filenames into a playlist. If the link is capitalized wrong, it obviously won't find the correct site.
Is this the correct way to blacklist certain files from my subscriptions? Getting sick of seeing AI shit show up in my artist subs.
I noticed that when a gallery downloader is running and one domain's downloader has to fetch data from another domain, it'll always make the request immediately, even if that domain is supposed to be blocked temporarily because of a bandwidth limit that you set. It'll say "overriding immediately", then do the request. Is there a way to stop Hydrus from doing that and make it respect the bandwidth limit? I don't see any information about it, or even a mention that Hydrus does this. I didn't click the "override bandwidth after 5 seconds" option in the cog menu; it's just doing that on its own.
>>23158 Thanks for the extensive reply, I can see why tags are always lowercase. In my case, it's basically what >>23159 said: lots of soundposts in 4chan use capitalized words. Is it possible to export notes as filenames? That would also solve it, I believe.
Missed a couple weeks. I don't care about pixel-perfect comparisons, but do understand why people value that for preservation/deduplication reasons.

>>22829 >wanting to know where a file came from PICREL
That's why I manually edited my gallery download tab, and subscription downloaders, to add
>website:gelbooru
>website:e621
>website:derpiboor
to the downloads. I never set up parent/child tags, so it doesn't show in search. If I manually add website:gelbooru it pulls up everything from that site.

>>22840 >on broken databases and failing drives
FYI for anyone here, ZFS is currently the only known filesystem solution that offers end-to-end data verification and has bit rot protection. No other file system can protect against bit rot. (to my knowledge)

>>22868 uhhh based?

>>22878 Pretty pretty please can you tweak the existing exhentai downloader to download tags + download into a .cbz? :3 I can tip you!

>>22963 I'm 9 days late but can test for you if still needed. Have a good vacation! You deserve it!

>>23029 I will continue to advocate for Hydrus's original idea of being a p2p file sharing software!

>>23137 I'm tarded and can't code, but it sounded like Dev-sama uses a tile-based algorithm? That probably gets consolidated into a "detection score", and a bunch of 100% matching tiles throw it off. I hate the AI buzzword, but some kind of dynamic algorithm could be used? Thinking about an algo that throws out perfect duplicate squares and focuses on the different ones makes my brain hurt, thinking how it could be used to find duplicates.

>>23161 Looks right to me *shrug*. Go through your inbox and wait for an update to see if the inbox file is correct.
>>23133 >locked pages damn i love this
>>23167 >download into a .cbz
This isn't possible with a hydrus downloader. But you could easily use gallery-dl to download, plus a simple python script to turn the result into a .cbz and create sidecars, and set up hydrus to import it.
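As a rough illustration of the python half (the file layout, sidecar format, and function name are assumptions, not a real hydrus integration): a .cbz is just a zip of image files, and a one-tag-per-line .txt next to the file is the kind of sidecar hydrus's importer can be pointed at.

```python
import os
import tempfile
import zipfile
from pathlib import Path

def make_cbz(gallery_dir: str, cbz_path: str, tags: list[str]) -> None:
    """Zip a folder of downloaded pages into a .cbz and write a
    one-tag-per-line .txt sidecar next to it for hydrus to import."""
    pages = sorted(p for p in Path(gallery_dir).iterdir()
                   if p.suffix.lower() in {'.jpg', '.jpeg', '.png', '.webp'})
    # ZIP_STORED: images are already compressed, so don't waste CPU deflating
    with zipfile.ZipFile(cbz_path, 'w', zipfile.ZIP_STORED) as zf:
        for page in pages:
            zf.write(page, arcname=page.name)
    Path(str(cbz_path) + '.txt').write_text('\n'.join(tags))

# toy usage with a throwaway folder standing in for gallery-dl output
workdir = tempfile.mkdtemp()
for name in ('02.png', '01.png'):
    with open(os.path.join(workdir, name), 'wb') as f:
        f.write(b'fake image bytes')
cbz = os.path.join(workdir, 'comic.cbz')
make_cbz(workdir, cbz, ['creator:someone', 'title:some comic'])
```

In practice you'd point gallery-dl's output directory at this, then have a hydrus import folder watch wherever the .cbz files land.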
>>23167 >ZFS is currently the only known filesystem solution that offers end to end data verification, and has bit rot protection I'm looking into it. Thanks anon.
Is there a way to lower the thumbnail top banners where you can put tags to be displayed on the thumbnails? Since I have ratings showing on thumbnails enabled, they completely block the text on the banner. Alternatively, could the rating display be lowered? I just don't want them overlapping, so I can see both on the thumbnail.
>>23167 BTRFS also has bitrot protection; you just need redundancy, which ZFS also requires to protect against bitrot.
>>23026 Yeah, sorry for the messy way these settings work. It is worse because you have a legacy DA situation that I've since cleaned up for new users. Hit up network->logins->manage logins, find anything DA-related, and either hit 'flip active' or just delete it. Hydrus will then no longer obey the 'rules' of that login script before it tries to hit DA.

>>23073 I don't know if there is a hydrus downloader floating around that will do this (I don't obviously see one on the user repo), but I wonder if gallery-dl can extract your favourite URLs to a .txt or something? You could paste that right into a hydrus urls downloader page and be off. If rule34.xxx favourites aren't public, you'd have to have it log in for you.

>>23137 Thank you, these are excellent. This 'small difference?' mode needs tuning. That mode basically says 'if there is an interesting tile (i.e. not a flat background) that is pixel-perfectly similar, then if there are any other tiles that differ, assume this is a subtle alternate and return negative'. This helps capture certain alternates with very subtle differences. I'll see if I can make it say '--if there are any other tiles with >x difference--' instead, allowing for very subtle jpeg encoding differences or whatever is actually different in your images here. I also want to pin down proper edge detection, which, if it works, will let me relax a lot of the 'err on the side of negative' stuff.

>>23143 >>23155 >>23158 >>23159 >>23165 Sorry for the frustration here! Tags are for searching in hydrus, and I agree that filenames and titles don't fit into them well. A different program, Tag Studio, has the metadata concept of a 'one line of text', basically a one-line note. I think hydrus will eventually add this type of metadata, allowing any unicode in it, and we'll have a grand migration of long-ass tags over to it. You can't export notes to filenames; I don't think there is a good automatic solution for your soundposts stuff.
Manual copy/paste from a note is probably your best bet.

>>23164 I'm afraid I force file downloads to happen within, like, 20 seconds or so, because file URLs are sometimes given with time-out tokens, so I essentially pair them together to ensure an unlucky block time and delay doesn't cause a failed download later on. I can see that this sucks if the Post URL and File URL are on completely different domains that don't share anything (so the Post URL isn't being blocked by the File URL's domain being maxxed out). I'm not sure what the clever answer here is. Probably an advanced setting in a URL Class or Parser. For now, I'll add a simple checkbox to options->downloading that disables the behaviour globally, so you can get fixed.

>>23171 No, sorry, but one day I'll overhaul all this and let you put all the sub-widgets that decorate a thumb in any location.
>>23175 >I'm afraid I force file downloads to happen within, like, 20 seconds or so
In my case, it's always instantaneous; it doesn't override after 20 seconds or anything like that. It always overrides right away, as if there's not even a limit being exceeded at all. The only reason I even know it can see the limit is because it says "overriding" for about half a second before it forces the download. If it happening instantly like this is still an example of what you mean, then yeah, an option to just disable it for now would be appreciated. Thanks!

By the way, in case it matters or might be useful to know, the reason I need this to work is that the domain the post page fetches the files from is the one the data is actually being taken from (of course), so when I want to set a data limit, I have to set it on that domain. I can't set a data limit on the domain of the gallery (where the initial request is made and where Hydrus would actually respect the limit), because that's not the domain the data ultimately comes from, so Hydrus would never hit the limit there, since it's looking at the wrong domain.
>>23176 Hmm, thank you, that's interesting. I'll look into what is causing the instantaneous override. Maybe it is three/five seconds I set it at or something, but there might be something else going on too. I wonder if it would be appropriate to register data usage for a (differing) spawning domain too. That's kind of a good idea, kind of not.
Here's an APNG Hydrus can't render "The file with hash "715ecb5c2a180e241db918a6056cb4c90a68ca6b35a790947bc18c8a73ebd7e6" seems to have failed to load in mpv. In order to preserve program stability, I have unloaded it immediately!" https://gelbooru.com/index.php?id=11949312&page=post&s=view
Back before I started using Hydrus I was using stupid filenames like "character name123" or "character name and character name456". When I imported everything I tagged them all with their filenames. Is there a way I can view all files that have a specific word in their tag so I can mass edit them? For example, if I type reimu in the search bar I see "reimu1" "reimu2" "reimu and remilia5" etc pop up.
>>23183 yeah you can use wildcards, so in that case you can do a search for `filename:*reimu*` and it'll return any file with a tag in the filename namespace that has `reimu` somewhere in it
I think MPV worked for me in 620a last time. Now it's always "MPV is not available!" This is on Debian testing, and I think I tried both "n" and "t" for MPV.
>>23223 A bug report properly documented as shown at >>17224 may help.
>>23235 Thanks. Installing the mpv library/dev/python stuff made it work.
is there a way that I can get hydrus to download more tags from e621? at the moment its missing a good amount of them
>>23250 I just had an e621 subscription update and all the tags are there? Can you link the image and show in your client what tags are missing?
Is there any way to bypass the Danbooru Gold requirement? I'd buy one but they haven't been sold for years at this point.
>>23252 A good number (most? all?) of danbooru's posts, including gold locked ones, are on gelbooru. You can find posts mirrored from danbooru with the user:danbooru search there. Other than that AFAIK there's only contributing to the site enough that they promote you to, IIRC, builder, which is equivalent to gold. Or waiting for one of the platinum raffles and participating in that.
>>23253 I know about Gelbooru, I'm just concerned about whether it will stay around in the future; it's like the only booru that still hosts loli mostly publicly that is accessible to hydrus. With the crackdowns on loli art and scraping recently I'm getting paranoid about the future access of Gel. Same with Danbooru too desu, but I think it's less popular so it's a little bit safer in my eyes. I could probably get to builder, or at least upload enough to not look like a leech in a platinum raffle though; I'll try it if I can find the time, thanks anon.
I had a great week. As I hoped, I cleared out a bunch of bad code and finished small todo items. I fixed some laggy list updates, brushed up some options and menu layout, and improved the feel and speed of several duplicates auto-resolution panels. The release should be as normal tomorrow.
Derpibooru has changed, now the downloader does not get much metadata.
https://www.youtube.com/watch?v=hJIlPvEI-sg
windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v625/Hydrus.Network.625.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v625/Hydrus.Network.625.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v625/Hydrus.Network.625.-.macOS.-.App.zip
linux
tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v625/Hydrus.Network.625.-.Linux.-.Executable.tar.zst

I had a great week. I finished a bunch of small jobs.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

highlights

The subscriptions dialog is now much much faster when you do large edits like a 'separate' call or pasting many new queries into a sub. All lists that undergo mass edits are now faster, but this particularly applies to subs, where some jobs could add several minutes of lag--the same job now takes a couple of seconds.

The page tab menu has a new 'collapse pages' entry, which lets you collapse the pages 'to the right' into one new frozen search page (and closing the old pages). If you have ten importers in a row, this lets you collapse them all into one thing you can process with. Try it out!

file->shortcuts is gone! It is now under file->options->shortcuts. A bunch of the other options pages are cleaned and chopped up into easier-to-digest pieces.

The horrid manage->file relationships thumbnail menu has had a small usability rework, and there's a new 'delete all false positives within this file selection' command.

The 'A and B are visual duplicates' test is now careful to give a false result if either file has transparency--it isn't smart enough to deal with this yet.

The 'preview' and 'review' panels in duplicates auto-resolution do their behind-the-scenes jobs better and spend less time doing nothing or hanging the UI.

next week

I am now on vacation for a week. v626 will be on the 18th. Thanks everyone!
>>23263
I was just going to post about this. It only added my manually added tag. I'm on v619 still lol. (warning porn)

>>23265
TY dev-sama.
Fuck ddos-guard. rule34video now requires updating browser cookies every couple hours.
I created a "tag service" which really should have been a "file domain". Can I unfuck this somehow? Or is there a search filter which will show me all files in a tag service? like "service:generic"?
>>23313 Tags are in a tag service. Files can have tags in that tag service, but the files themselves are in file services.
>>23316 Maybe I worded that poorly. What I mean is, is there a special search which will show me all files with any tags in a tag service? Alternatively, all files ( * ), restricted to a single tag service I'd like to create a separate file domain instead with the results of that search
>>23318 Click the filter field, then "all known tags" and choose the tag service, then filter by number of tags.
>>23251 I get everything except for the "species" tags which I really need to organize everything.
Just noticed that my 4chan watchers give me 403 errors on specific files (similar to the deterministic failure above on kemono). Does this happen for anybody else?
Hey if you use the hydrus AUR package (Arch Linux, usually), you may have suddenly run into lots of file import errors because of a 'numpy' issue. Sorry for the trouble! I fixed some deprecated calls a few weeks ago, but I also missed others. A user kindly submitted a PR to fix it all, which I have now merged into master. You should be fixed on the next hydrus AUR package update. If that doesn't happen for a while, and if you can rewind your 'numpy' to version <2.3, which I think means 2.2.6, then do that and I think you'll be working again until everything trickles down to the AUR. If you never want to run into this 'latest' version trouble again, you might like to consider manually running from source, which fixes library versions in a virtual environment to what I know works, as here: https://hydrusnetwork.github.io/hydrus/running_from_source.html
>>23345 Are they all videos? Might be more cloudflare shit, Chance is broken because of it.
>>23341 Works on my machine (warning porn). I'm also still on v619. I'll upgrade and test tomorrow and see if it works.
>>23341
>upgraded to v625
>force update subscriptions
Works on my machine. Have you confirmed the original post is properly tagged?
>>23349 You are right. That must be it. Thanks, anon.
Would it be a bad idea to store the image storage of Hydrus on a ZFS filesystem for that sweet, sweet compression? It might be only 1%-3% compressed for images, but that's still multiple gigabytes for 1TB of data. How about the database itself? SQLite isn't particularly space-efficient.
>>23357 A quick google search shows SQL works on ZFS. I'd focus on using ZFS for redundancy and bit rot protection, more than compression.
>>23362 I ask because the docs specifically warn against using the database on a compressed BtrFS filesystem because it might corrupt. I know SQLite uses write-in-place, but that shouldn't really be a problem on ZFS, which has a copy-on-write system with a transaction cache that can be multiple gigabytes large. It might be a problem during power loss, but it should be configurable to minimize the amount of write latency.
>>23363 The top few google results say ZFS cannot lose data on power loss (due to copy on write.) I bet power loss is more likely to corrupt a SQL database logically than at the filesystem level on ZFS. Plus you should have a UPS anyways. I'd always test and see if it works for a while before fully migrating to ZFS.
>>23363 There's a few mixed-up myths here. SQLite on BTRFS used to be too slow for production. One way to make it faster was disabling COW for the DB, but this also bypasses all the corruption prevention that BTRFS provides. That was the trade-off: If you don't need the corruption-prevention, you can turn off COW for a fast DB. BTRFS isn't as slow anymore. In many situations I've no problem using the default settings. Where the DB is the critical path, the best option is to switch to Write-ahead-log in SQLite. https://sqlite.org/wal.html `PRAGMA journal_mode=WAL;` Here's a post from one of the BTRFS devs: https://wiki.tnonline.net/w/Blog/SQLite_Performance_on_Btrfs
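For anyone wanting to check this outside hydrus, here's a minimal Python sketch of flipping an SQLite file to WAL mode (throwaway temp file only — don't run pragmas against a live hydrus db without a backup):

```python
import os
import sqlite3
import tempfile

# WAL needs a real file on disk; :memory: databases stay in 'memory' mode.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
con = sqlite3.connect(path)

# The pragma returns the journal mode now in effect.
mode = con.execute("PRAGMA journal_mode=WAL;").fetchone()[0]
print(mode)  # -> wal

# The setting is persistent: it sticks for future connections to this file.
con.execute("CREATE TABLE t ( id INTEGER PRIMARY KEY );")
con.commit()
con.close()
```

Once set, the `-wal` and `-shm` sidecar files will appear next to the database while it is open.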
>>23319 Thank you very much!
>>23363
>the docs specifically hazard against using the database on a compressed BtrFS filesystem because it might corrupt
I've been using Hydrus on compressed BtrFS for 4 years at this point, so I don't know why it says that. I've never had a problem, and the only user I heard who did had a failing drive I believe
Does anyone know if there are alternative, more up-to-date datasets for Garbevoir's AI tagger? SmilingWolf hasn't updated his dataset for nearly a year now.
>>23373
>had a failing drive I believe
it was failing RAM at least
>>23346
Update on AUR Package: I have learned this is now going to be folded into the v626 AUR package, sorry for the delay and trouble!

>>23181
Thank you, I get the same. FFMPEG can't generate a thumb for me either. I'm not sure why there are so many apngs like this--I guess there is a specific converter, or converter settings, that a bunch of people have used that are technically invalid or not well supported. I can't do much, I'm afraid, but having the example is very useful. A future version of mpv & FFMPEG may fix this magically.

>>23243
Hey, can you remember any of the specific steps here, so I can add it to the help? Was this a libmpv1 vs libmpv2 issue? I know we are about to cross that change when I move the built Linux release from Ubuntu 22.04 to 24.04.

>>23263 >>23282
Thanks I'll look for or do a fix and roll it into v626.

>>23363 >>23366 >>23373 >>23375
Thank you, I will update the docs! Hydrus uses WAL by default, which is good news here.
Edited last time by hydrus_dev on 06/14/2025 (Sat) 19:51:59.
>>23357 >>23362 >>23363
There are many variables to consider with storage (e.g. reliability, backup, versioning, compression, latency, throughput), and different things have different storage requirements. I could slash about 50% of the database size very easily by cutting out 'duplicate' indices, but it would make many queries hundreds or thousands of times slower.

There are four components of a hydrus database storage in this discussion:

hydrus install directory - latency and throughput is most important for fast startup
SQLite database files - latency and reliability most important. some form of backup critically important
thumbnails - latency most important
media files - backup most important, throughput is generally important, latency not important (0.1ms vs 8ms not noticeable on a per-file basis). files are big so cost of storage important

So if you have a large sophisticated database that you want to spread across multiple partitions, as here, https://hydrusnetwork.github.io/hydrus/database_migration.html , I recommend putting the install, db, and thumbs on an expensive fast SSD, and your media files on a cheap HDD.

As to the filesystem, that is your choice, but ensure that your db files are on a reliable and fast-responding filesystem that you can backup. If you can compress the live database safely, that still probably increases latency, so I recommend against it. Compressing database backups works very well though (a 7zip high quality compression can save 40-60% of space), so feel free to play around with it.

I don't recommend trying to compress your media storage. As you say, it is only 1-3%. Don't think about this in absolute terms (multiple GB once you reach 1TB), since you are still bought in to the 1TB. Think about what that relative 1-3% compression costs you in CPU or directory scan overhead or whatever.
Worth doing some human tests, but imo when it comes to media file storage, while it sucks to say, the simplest and cheapest solution remains 'just buy a bigger disk'. Given how much an 8TB drive costs, how many dollars or cents is saving 20 GB, these days? I don't know about you, but I regularly buy new disks to expand and replace my old, so that 1-3% would probably work out to 'buy that disk one week earlier', rather than 'you have saved real money'. And is it worth hours of setup and testing and potential headaches down the road? Is it worth the electricity cost of migrating and compressing 1TB, if that need be done? Generally, just cheap simple storage for files is best.

Anyway, those are my general thoughts, and it is mostly just experience and intuitive feel rather than hard numbers. I don't know much about clever Linux filesystems, so I could well be wrong on one feature or another. If you try something out and get particularly good or bad performance, I'm interested to know.
>>23376
>>>23243
>Hey, can you remember any of the specific steps here, so I can add it to the help? Was this a libmpv1 vs libmpv2 issue? I know we are about to cross that change when I move the built Linux release from Ubuntu 22.04 to 24.04.
There is a libmpv2 and not libmpv1.
also, 625 can't render videos:

================ Exception ================
ValueError: The binary mode of fromstring is removed, use frombuffer instead
================ Traceback ================
Traceback (most recent call last):
  File "/home/hydrus/hydrus-625/hydrus/client/ClientRendering.py", line 1025, in THREADRender
    numpy_image = renderer.read_frame()
  File "/home/hydrus/hydrus-625/hydrus/core/files/HydrusVideoHandling.py", line 1026, in read_frame
    result = numpy.fromstring( s, dtype = 'uint8' ).reshape( ( h, w, self.depth ) )
             ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
ValueError: The binary mode of fromstring is removed, use frombuffer instead
================== Stack ==================
  File "/usr/lib/python3.13/threading.py", line 1012, in _bootstrap
    self._bootstrap_inner()
  File "/usr/lib/python3.13/threading.py", line 1041, in _bootstrap_inner
    self.run()
  File "/home/hydrus/hydrus-625/hydrus/core/HydrusThreading.py", line 450, in run
    callable( *args, **kwargs )
  File "/home/hydrus/hydrus-625/hydrus/client/ClientRendering.py", line 1029, in THREADRender
    HydrusData.ShowException( e )
  File "/home/hydrus/hydrus-625/hydrus/client/ClientData.py", line 174, in ShowExceptionClient
    ShowExceptionTupleClient( etype, value, tb, do_wait = do_wait )
  File "/home/hydrus/hydrus-625/hydrus/client/ClientData.py", line 213, in ShowExceptionTupleClient
    message = HydrusData.PrintExceptionTuple( etype, value, tb, do_wait = False )
  File "/home/hydrus/hydrus-625/hydrus/core/HydrusData.py", line 381, in PrintExceptionTuple
    stack_list = traceback.format_stack()
=================== End ===================
>>23381 Sorry for the trouble! You will probably have trouble importing files too. This is this >>23346 issue. A user kindly submitted the fix, which is currently on master. If you are able to git pull to latest with your install there, you'll be fixed. If you cannot git pull, temporarily downgrading your (systemwide?) numpy will also fix the problem. If you cannot do that but want the fix now and are ok doing the edits yourself, check the small commit here: https://github.com/hydrusnetwork/hydrus/commit/82e71d9e2bd7b21661dbb1f062917f23fed6f874 Otherwise, the fix will be in v626.
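For the curious, the actual numpy break is tiny: binary-mode `fromstring` was removed in numpy 2.x, and `frombuffer` is the drop-in replacement. A generic illustration (not the exact hydrus code):

```python
import numpy

# Twelve bytes standing in for one tiny 2x2 RGB frame from the ffmpeg pipe.
s = bytes(range(12))
h, w, depth = 2, 2, 3

# Old call, removed in numpy 2.x:
#   numpy.fromstring( s, dtype = 'uint8' ).reshape( ( h, w, depth ) )
# Drop-in replacement:
frame = numpy.frombuffer( s, dtype = 'uint8' ).reshape( ( h, w, depth ) )
print( frame.shape )  # -> (2, 2, 3)

# Note frombuffer returns a read-only view of the bytes; append .copy()
# if the array needs to be writable downstream.
```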
>>23376 FYI this might help with the Derpibooru issue. imgbrd-grabber (a different program) stopped downloading NSFW images without an API key entered. It's possible adding derpibooru to the login scripts could fix the tag problem. TY dev-sama.
Would be nice to have default options for importing manually. It used to be a bit simpler and faster a long time ago, but now I have to click 7 times to import tags from sidecar txts. It's always the same set of clicks, too.
>>23377 Filesystem-level compression shouldn't affect read latency or bandwidth whatsoever, which is what ultimately matters for storing all the media files, and filesystem-level compression should be reliable enough to just turn it on and forget it. In fact, with ZFS specifically, adding compression can actually be faster than having no compression, because the compressed files write faster than uncompressed ones. And compression algorithms like LZ4 cost basically no CPU whatsoever, even in huge loads, so there's practically no reason not to use it. At worst, you lose nothing and gain nothing, since ZFS doesn't employ compression unless a significant enough per-file compression ratio can be achieved. Filesystem compression is also really simple if you already have ZFS: it's just a parameter toggled, and henceforth all future files will be compressed with the specified algorithm. ZFS also has advantages with backing up and synchronization to other ZFS pools, which is how I'll probably back up all the media in the future. It's only applying compression to the DB itself that I'm concerned about, so I'll have to do some testing after backing up.
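If anyone wants to try it, flipping compression on for a ZFS dataset really is one command (the dataset name here is made up, and note the property only applies to blocks written after you set it, so existing files stay uncompressed until rewritten):

```shell
# 'tank/hydrus-files' is a placeholder dataset name.
zfs set compression=lz4 tank/hydrus-files

# After some writes, check the setting and how well it's doing:
zfs get compression,compressratio tank/hydrus-files
```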
Is there any way to grab those translation tooltips from Gelbooru as notes?
Not Hydrus related, but hydrus users might like it. I found the software Lanraragi for managing manga. It's basically the same as Hydrus but for manga. Import .zip files, scrape websites for tags, then you get an ehentai-style web interface.
>>23402 Lanraragi was one of the major things that pushed me to fully selfhosting everything I need, totally worth it if you can spare the money & time to set it up right. Also of note is Suwayomi, a server version of Tachiyomi/Mihon with Mihon extension support. I run both lanraragi and suwayomi as I use them for different purposes.
I had a good week. I fixed quite a few bugs and improved the accuracy of 'A and B are visual duplicates'. The release should be as normal tomorrow.

>>23383
Thanks for letting me know. I managed to update the parser to grab the tags again and am able to see nsfw when I set no filter, so I'm not sure what that is about. I don't use the site personally, so let me know if you run into trouble with normal use.
I've known about Lanraragi for a real long time. I'll admit it's very useful for what it does, but for me personally, I didn't care for its setup. I just don't like browser-focused programs. Personally I'd prefer something similar to hydrus for managing comics, or the old Happy Panda program. I'm surprised no one has made any by now.
>>23404
TY dev-sama, I'll test and let you know if the derpibooru downloader is working again.

>>23405
I was experimenting with using hydrus to manage manga, but it just isn't there yet, while lanraragi just works. You're completely right the setup is AIDS. Homebrew expects you to read the terminal and apply a command, instead of telling you to enter it on the instructions, and you also have to install perlbrew. Hydrus is so much easier because you just run a binary. IMO the browser interface is perfect and does anything you could need it to do.
>>23405 For comics, comicrack was the gold standard for the longest time before, and even after, it was abandoned.
https://www.youtube.com/watch?v=lkc2y0yb89U
windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v626a/Hydrus.Network.626a.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v626a/Hydrus.Network.626a.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v626a/Hydrus.Network.626a.-.macOS.-.App.zip
linux
tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v626a/Hydrus.Network.626a.-.Linux.-.Executable.tar.zst

I had a good week. There's a bunch of fixes and some new stuff.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

highlights

The AUR (Arch Linux) release broke during my vacation. It was my mistake--I didn't catch something that was deprecated for way too long, and suddenly file imports wouldn't work--and I'm sorry! If you have a recent source installation or ran into the AUR issue particularly, please update to this v626 when you can and you should be good again.

I improved the accuracy of my 'A and B are visual duplicates' test again. Almost all my false positive and false negative examples are fixed. I still have a little more work to do, but I'm feeling pretty good about it overall. I am still interested in seeing any more bad pairs, either false positive or negative.

File export filenames are a bit smarter about maximum file and path length. If you do a lot of exporting with tags and your filenames regularly get clipped, it should work better now, particularly with unicode and patterns that produce subdirectories. There's a new setting under options->exporting, too, that lets you override how long your export filenames can be.

The default derpibooru downloader stopped getting tags. I think I've fixed it, but let me know if there are still any problems.

For a bit of fun, I added support for Paint.NET files (.pdn) this week. They should have resolution and thumbnails.

If you are an advanced downloader maker, please check the changelog about the 'url slash test'.
I'm interested to know if an URL-handling change will break anything on your end.

next week

More duplicates auto-resolution work and polish.
>>23414 Derpibooru is working again! TY dev-sama!
I'm trying to do something that's probably stupid. I want to create a subscription for a search on desuarchive. Basically it searches a certain board for threads with a certain subject. I've been able to make a parser for the search results that gives the URL of each thread and the next page, as well as a corresponding URL class. My problem is the URL generator. Desuarchive's search system is URL based, so for example a URL may look like this: desuarchive.org/%BOARD%/search/subject/%SUBJECT%/type/op/start/%DATE%/order/asc/ The problem is that if I go the dumb route and just have the search term be everything after the domain name, the URL generator wants to escape the slashes in the request URL. I don't see a way to disable this. Is there a better way to be doing this?
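To make the escaping problem concrete, here is what a standard URL encoder does with slashes in a search term, using Python's urllib as a stand-in for whatever hydrus does internally (an assumption on my part):

```python
from urllib.parse import quote

# A search term that is really a whole path, as in the desuarchive URL scheme.
term = "g/search/subject/hydrus/type/op"

# If the encoder treats the term as a single path segment, '/' gets escaped...
print( quote( term, safe = '' ) )   # g%2Fsearch%2Fsubject%2Fhydrus%2Ftype%2Fop

# ...and literal slashes survive only if '/' is declared a safe character:
print( quote( term, safe = '/' ) )  # g/search/subject/hydrus/type/op
```

So unless the GUG leaves '/' unescaped somewhere (e.g. in the separator), the term will always come out percent-encoded.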
>>23418 Programming languages always have a mechanism for delimiting meta characters. For example the Linux bash shell will let you enter '~/space folder' or ~/space\ folder with the \ telling bash to treat the following space as a character and not break in the command. (This is an example IDK if \ is the right character I don't have a PC in front of me. Look up what characters your language uses for delimiting in relation to text/strings)
I noticed that when setting a file as better than another manually through the right-click context menu of a thumbnail, the ratings act a little funny: the new better image gets the like/dislike rating, while the old image has its like/dislike removed, and any number rating seems to get doubled.
I'm trying to load a previous backup but I'm getting stuck on this screen. I'm using 623. Open the menu -> load backup from database -> get stuck. Any idea what's going on?
>>23421 Scratch that, it's working now. I'm a retard. ...that being said, maybe we could use a better status message here? "waiting for services to exit" doesn't tell me much when it's frozen.
>>23422 Sounds like the services were waiting to exit and you didn't wait long enough lol.
I think I forgot to mention by the way, but that solution for the problem where one domain has posts pages, but sends file requests to another domain, so bandwidth for the file domain can't be limited, worked great! Your first solution, to treat bandwidth usage of a file domain as also belonging to the post domain that forwarded the request, worked perfectly. Because it worked, I actually didn't need to try your second solution of disabling the force-override of file requests to download files. If I ever run into a situation where this first solution doesn't work though, I'll let you know if the second does. It's probably a good idea to have an "escape hatch" option to disable that behavior anyway, since I don't believe it's documented anywhere that Hydrus does that. that's what caused me so much confusion and made me keep looking for what I had set wrong, before finally coming to you for help. Seeing the option is a sort of implicit documentation that Hydrus does this if you don't disable the checkbox, so that's good.
>>23390
Thanks yeah, I hate how clunky this all is. I've been planning a complete overhaul of much of this, and favourites/defaults, but just haven't set the time aside. Same deal for favourite 'file/tag/note/whatever import options' generally for quicker importer setup.

>>23391
I think you could sell me on this angle given I/O latency versus CPU processing time. I don't have experience with it myself, and if you say that most data can compress fast enough to make it worth it in saved I/O, I'll hesitantly believe you, but the back of my mind still thinks there will be some spiky latency that makes the KISS solution preferable. If you use hydrus on ZFS, I'm interested to know how it works out!

>>23399
Not yet. We have some ideas, but the effort mostly petered out because it is a pain in the ass in a couple ways. For real, I wouldn't be surprised if Qt or OSes in general suddenly get native translation layers and the whole question will be solved in a different way.

>>23402 >>23403
Looks really interesting! I'll add it to my help docs as a good thing to check out for comics/manga.

>>23418
I think a long time ago, you could hack it by putting %2F instead of slash, but my URL encoding tech is more 'proper' now so this won't work any more. I know a couple of guys who rigged together a locally hosted server that would produce gallery urls on demand. So you'd tell a downloader to hit 'localhost:12345/get/watch_urls', and their external thing would handle whatever it was they were pasting or archiving or whatever and would spit out results. That's probably more than you want to deal with; unfortunately hydrus is just not great at multiple inputs yet. You might be able to do this with a GUG that sets '/' as the search terms separator, and then you'd be pasting "board search subject subject type op start date order asc". The encoding does not seem to apply to the search terms separator, so worth a test at least.
>>23420
I am afraid I do not get this when I try to replicate. Can you examine your default 'duplicate metadata merge options' for 'this is better' (hit the edit button on a duplicates page), and see how that lines up with your settings? When you say 'numerical' rating, do you mean an 'inc/dec' rating that you click to change the numerical count? I think that does a summing merge ('add worse to better' etc..) when asked to merge ratings. The 'like/dislike' and what I call 'numerical' ratings (as in 'three out of five stars') seem to transfer as the rules expect, be that copy or move.

>>23421 >>23422
Sorry for this. I hate how this 'workflow' operates. It was always hacky, and I haven't kept up with my maintenance as other things have changed. Since it is an important thing that really shouldn't ever go wrong, I have been thinking about abandoning the internal backup and restore system entirely and just saying 'use freefilesync' from here out. What do you think of that? It would be work to transition users like you to using a third party program, but that program (or something equivalent) is always going to work better than what I can cobble together.

(At least until I completely overhaul the program's internal structure and add switchable profiles or something. atm I connect to the db very early in the program bootup and disconnect very late, so if I do a big db operation, I either have the awful pause for backup or this awful presentation for restore, where it seems any updates I am pushing to the splash are probably not getting drawn to screen because I already suspended the daemon that handles that)

>>23424
Hell yeah! Thanks for letting me know.
>>23428
NTA. I doubt the compression vs CPU latency matters. Only when Hydrus is P2P and ran at scale will that start mattering.
>translation layers
No, that anon was talking about "notes" added to images. Since Gelbooru is Japanese, people will put English translation notes over text. e621 does the same thing when artists have typos. Example: https://gelbooru.com/index.php?page=post&s=view&id=12154000&tags=translated+
>can't find one for e621, but it's the same thing
>adding lanraragi to hydrus notes
epic!

>>23429
>freefilesync
NTA. I enjoy a built in backup system, because in theory you can always corrupt a database by forgetting to close Hydrus and running external backup software (on purpose or automated schedule.) Hydrus's backup will always work. It doesn't need fancy features. People can manage their own advanced backup program if they want advanced backups. Awful pauses on backup/restore are a feature not a bug. You are guaranteed good backups.
Oh I see what dev-sama was saying. They think OSes will have built-in image translation eventually anyways. I disagree it would be out of scope, only because it's information the boorus serve to you anyways. It's the same as any other information boorus serve users.
>>23432 Yeah, you could easily make the parser grab translations as notes, but then they would just all be at the side and not correlated to a particular part of the image. Makes it pretty pointless if the image has multiple characters talking or god forbid panels. You might be able to add x and y coordinates to the note, but that just seems confusing? And some images with notes have special html, like bold or italics that would have to be displayed properly. That's why note parsing isn't in the defaults. That's the earlier conversation hydev was talking about - discussion about potentially adding the same translation layer thing that boorus have where notes can appear on the image in specific places.
>>23440 >but then they would just all be at the side and not correlated to a particular part of the image. Makes it pretty pointless if the image has multiple characters talking or god forbid panels That's better than nothing. Right now I'm manually copy pasting the text of each speech bubble. This would reduce the work load to just making sure the sentences are in order.
I had a great week. The 'A and B are visual duplicates' system has a significant improvement and is enabled for testing in the duplicates auto-resolution system. The release should be as normal tomorrow.
>>23445 You're doing fantastic work! I simply thought that automatically handling any duplicates that weren't pixel-for-pixel wouldn't be possible programmatically (without advanced machine-learning at least) but I'm shocked by how accurate the "checker" here is at high confidence (like `almost certainly` and `nearly perfect`). the only real issue I see here is that while Hydrus can tell in many cases that a pair of files are a duplicate pair, it doesn't have any way to tell which one is the better one. If it's a case of the resolution being different, then in most (but not all) cases, the higher one is better so you can just bake that into the auto-resolution rule, but for cases like lossy compression, I don't see any easy way. Though I'm not sure that anything can be done about this. thankfully, there are sites that tend to produce good dupes frequently and sites that tend to produce bad dupes frequently, so a rule that checks that e.g. A is from Pixiv and B is from Xitter is probably safe as long as they're definitely not alts
>>23440 IMO hydrus should scrape all information served by boorus. Translation tool tips, scores, and maybe even comments. >>23445 TY dev-sama. >>23451 You might not be able to train an AI to find the better one, because JPEG artifacts might get confused with a weird style choice in a different image.
>>23428 >I think a long time ago, you could hack it by putting %2F instead of slash, but my URL encoding tech is more 'proper' now so this won't work any more. My current solution that Works Well Enough is to use curl to grab each page, grep to filter out the post IDs, sort and uniq to remove any duplicates (there probably are none; I just did this out of habit), sed to add the beginning of the URL to them, then copy/pasting that into a watcher page. Could probably do all that in a perl oneliner but I can't be assed. Sadly, it seems many 4chan archives block curl even if you fake a user agent; it works on desuarchive with a firefox UA but not thebarchive, for example.
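The grep/sort/uniq/sed part of that pipeline can be sketched in Python too. This is a hypothetical sketch, not anyone's real script: the `/thread/(\d+)` regex and the `base_url` shape are assumptions about the archive's markup, so adjust them to whatever links the archive actually serves.

```python
import re

def extract_thread_urls(html, base_url):
    """Pull thread IDs out of an archive page and build watcher URLs.

    The regex is a guess at what the post links look like; swap it for
    whatever the target archive's markup actually uses. The seen-set does
    the same job as the sort | uniq step in the shell version.
    """
    ids = re.findall(r'/thread/(\d+)', html)
    seen = set()
    urls = []
    for post_id in ids:
        if post_id not in seen:
            seen.add(post_id)
            urls.append(f'{base_url}/thread/{post_id}')
    return urls
```

You would fetch each page with a Firefox User-Agent header first, then paste the joined URL list into a watcher page, same as the curl workflow.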
>>23441 >>23440 >>23432 Yeah, this generally reflects the ideas I've had around here. It would be nice to have, and we'd probably bodge it by saving the json or whatever to a hidden note and then hack some hardcoded note renderer into the media viewer that would draw this note to screen if it existed. I am still somewhat interested in trying this, but I think it might be years, and if all OSes suddenly just get a translation magnifying glass tool, then it might become moot. Or at least less important. >>23451 Thanks. The release I'm about to publish fixes my last false positives, stuff like a sweat drop that moves a little bit. I'm really pleased with how all this went. I just bashed my head against a keyboard and figured something out. ChatGPT was helpful in generating some algorithm ideas and walking me through statistical and histogram math that I was rusty on, and numpy channel slicing that I'm always bad at. This algorithm might now have too many false negatives, but that's a nicer problem to work on. There's lots I still want to do. I totally agree that we now want a bunch of nicer comparators like 'A has higher jpeg quality than B' and stuff so we can shape the ultimate processing side of this. I'm going to keep churning out new tools for the foreseeable future. Let me know what works and what doesn't.
https://www.youtube.com/watch?v=70N21Mynaa0
windows zip: https://github.com/hydrusnetwork/hydrus/releases/download/v627/Hydrus.Network.627.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v627/Hydrus.Network.627.-.Windows.-.Installer.exe
macOS app: https://github.com/hydrusnetwork/hydrus/releases/download/v627/Hydrus.Network.627.-.macOS.-.App.zip
linux tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v627/Hydrus.Network.627.-.Linux.-.Executable.tar.zst

I had a great week. The "A and B are visual duplicates" test is ready for real use.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

A and B are visual duplicates

I figured out a spatial edge-comparison component to the "A and B are visual duplicates" test this week. It worked out really well for detecting small alterations that the pure colour comparison was struggling with, and I solved my remaining false positive examples reliably. This test is now ready for proper testing and is now an available comparator in duplicates auto-resolution. I have added a new suggested rule in duplicates auto-resolution called 'visually similar pairs - eliminate smaller' that uses it. If you have been following along with this tech, please try it out and let me know how it goes. Leave it on semi-automatic, and be aware that this test is pretty CPU heavy--about one second per pair test. I'm interested in how the UI handles this sort of lag IRL, where the comparators go wrong (e.g. do we want the filesize/resolution test to be x1.4?), and obviously if you encounter any more false positives, please send them to me. I've just played with it IRL and I think it might benefit from an 'A imported earlier than B' test too.

This was the last difficult part of duplicates auto-resolution. I still have lots I'd like to do, but it will all be smaller features and polish. I will revisit the help and do a grand launch for all users once we know this is all tuned well.

future build

Only for advanced users! 
I am making another future build this week. This is a special build with new libraries that I would like advanced users to test out so I know they are safe to fold into the normal release. More info in the post here: https://github.com/hydrusnetwork/hydrus/releases/tag/v627-future-01

next week

I'm going to take an easy week and focus on simple code cleanup.
>>23454 >ChatGPT <not FOSS I'm disappointed, Dev.
>>23455 I'll test the advanced one tonight. >>23457 ChatGPT has access to web resources. Offline text models hallucinate too much to be useful.
>>23455 v627-future-01, source build used. Tested on Fedora 42. >Does it boot? yup >If mpv works in your real client, does it still work here? yup >How do your images look? Any incorrect colours or static? looks fine to me on my 4k monitor >Do you have any UI issues, particularly stuff like menu positioning on a multi-monitor setup? Everything looks and works fine. Changing the window scale, snapping, opening images, switching screens. IDK what the point of the "source" build is. Linux "executable" fails to run mpv on Fedora while the "source" runs mpv fine.
>>23430 > I doubt the compression vs CPU latency matters It's a very simple equation. You have the drive read speed, the decompression speed, and the maximum acceptable CPU load increase. If you can decompress faster than you can read, then it matters. P2P and scale have nothing to do with it. The numbers vary for every user, though, and will change when he upgrades components or even just if MSTeams updates and requires another 2% CPU sacrifice. If you go further, you can look at hydrus specifically. Does hydrus show high CPU usage while reading data? Then adding extra decompression load probably isn't a good idea. Again, usually only the user can know which trade-offs are worthwhile.
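That "very simple equation" can be written down. This is a back-of-the-envelope sketch, not a benchmark: it assumes the read and decompress stages are pipelined (so the slower stage dominates), and all the throughput numbers you feed it are your own measurements.

```python
def compressed_read_time(size_bytes, ratio, read_bps, decompress_bps):
    """Rough time to get size_bytes of usable data off disk when it is
    stored compressed at `ratio` (e.g. 2.0 = half the bytes on disk).

    Assumes the read and decompression stages overlap, so the slower one
    dominates; a serial model would sum the two terms instead.
    """
    read = (size_bytes / ratio) / read_bps
    decompress = size_bytes / decompress_bps
    return max(read, decompress)

def plain_read_time(size_bytes, read_bps):
    """Time to read the same data stored uncompressed."""
    return size_bytes / read_bps
```

With a slow disk and a fast decompressor, the compressed path wins; flip the numbers (fast NVMe, heavy codec) and it loses, which is exactly why the answer is per-user.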
I'm still testing the "visually similar" auto-resolution, but two quick thoughts. Without a duplicate-filter-like view to flip between the two files in a pair, with the locked zoom and pan and the helpful additional info, comparing them in semi-automatic mode to see if it matches the decision I would've made (basically whether it's a better/worse pair and it's in the right order) is very painful. I honestly wouldn't bother normally, but I want to test anyway to help you get feedback. I think you said you're already working on fixing this soon, so no rush, but it's a headache for me as it is. There's also a bug in the auto-resolver where it doesn't count files deleted from there as being deleted from the duplicate filter for the purposes of that "re-add to inbox" exception you recently added for the deletion lock. If that's on purpose, because auto-resolution is different from manual resolution, then that's fine and honestly makes sense, but if so, I'd like a checkbox for the auto-resolver to re-add to inbox like the one for the archive/delete and manual duplicate filters.
>>23465 >Fedora 42 >mpv works I thought it was fucked on Wayland?
>>23455 Hydrus.Network.627-future-01.-.Windows.-.Extract.only Windows Server 2022 21H2 >Does it boot? Yes >If mpv works in your real client, does it still work here? Yes >How do your images look? Any incorrect colours or static? Not that I can tell. Looks good to me. >Do you have any UI issues, particularly stuff like menu positioning on a multi-monitor setup? No issues.
>>23476 I don't use Wayland because it fails to capture the mouse in full screen games on a multi monitor setup. You have to specifically alt-tab back into the window and I'm not doing that every time I leave a game window.
Would you mind adding a button to duplicate an auto-resolution rule? I make a lot of rules that are just alterations of existing rules, so being able to duplicate one and then make the change would make the process much faster for me.
(2.66 KB 600x450 great plan.webp)

I haven't checked into Hydrus news since November: what is the current attitude towards syncing files and tags between devices? Last I remember, Hydev suggested this might become deprecated. If that's the direction it's going in, I might experiment with a personal Shimmie booru for it since this repo doesn't need downloaders or advanced features. (I will still be using Hydrus as well for my other device specific repos since they do use downloaders and other features)
* To explain that better: In an ideal world, I would open up Hydrus on any of those devices and get the exact same result- same files, same tags, same order.
>>23476 I'm on Fedora 42 with Wayland (I use Niri), and most things work as expected! The only weird thing I'm seeing is with the right-click menu: if clicked too low, it will have elements below the edge of the screen; probably a Qt bug and not a hydrus one. I use wayland-satellite, and start Hydrus with `DISPLAY=:0 WAYLAND_DISPLAY=` in my environment; everything just works, including mpv.
>>23484 I've used a hydrus server, and I think the remote tag service works great; that won't go away either. For syncing the files themselves, it works, but it doesn't feel like syncing. The remote file service is more like sending files over manually. Instead of a hydrus remote file service, there's probably a better setup with export/import folders and a separate folder-syncing solution like syncthing. > same order Maybe you need to start thinking about remote desktop, VNC, or X11 forwarding to access the same hydrus instance from several computers. You'd be missing offline access though.
>>23484 did you freehand that mouse? It's cute.
>>23488 Yeah, took a few tries. Thanks anon. >>23487 I didn't even think of X11 forwarding. I mainly use the repo for importing, tagging, searching and exporting, or drag-and-dropping into a browser upload form, so I suspect remote solutions like that would complicate things too much for my use-case. I am interested in the folder-syncing approach, since I'm only using one client device (laptop, desktop) at a time and I don't mind restarting the client to sync. Since the client is portable, it might be viable to just sync the whole Hydrus folder between the two devices on-demand; I don't think the database would mind, right? The only problem I can think of so far is if it were to sync while Hydrus was running, which is why I would sync on-demand instead of automatically.
>>23492 You can sync the client_files, but the database won't know you have the files. You really should NOT sync the sqlite database while hydrus is running. > I'm only using one client device at a time It'd generally be easier to store it on a thumbdrive or SD card if you wanted to take the database to multiple computers. If you have something like a yubikey, the UX would be similar. If you really think you have discipline, then yes, you could theoretically sync the database while neither client device is running hydrus. There's nothing to prevent that from working but user error. How good are you at backups? "never missed one, I rollback all the time"?
>>23493 You're right, with something as critical as a db I shouldn't be relying on discipline. "Almost always" isn't good enough for an archive. Thumbdrive is a good idea, this collection will easily fit.
* I wonder if making a simple lock script would work as a patch to hydrus_client.sh (or hydrus_client-user.sh).
- create a lock file on my homeserver
- create a last-used file on my homeserver (recording which device last used Hydrus)
- the Hydrus launcher on a device reads the lock file from the server, confirms it's unlocked (with no error) or that it holds the lock (after an unclean exit, like battery failure), then puts the device name in the file to lock it
- if the last-used device isn't this one, sync
- after hydrus_client.py exits (line 29), copy the lock file to the last-used file and then clear the lock file
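The steps above could be sketched like this. Everything here is hypothetical (the sync directory would be a mount of the homeserver; `device` would default to the hostname), and note it is not race-free: there's no atomic check-and-set over the network, so two devices launching at the same instant could both pass the check.

```python
import os
import socket

def acquire_lock(sync_dir, device=None):
    """Claim the lock file, refusing if another device holds it.

    An empty lock file means unlocked; our own name means an unclean
    exit on this device (e.g. battery failure), which is fine to reclaim.
    """
    device = device or socket.gethostname()
    lock = os.path.join(sync_dir, 'lock')
    if os.path.exists(lock):
        holder = open(lock).read().strip()
        if holder and holder != device:
            raise RuntimeError(f'hydrus is locked by {holder}')
    with open(lock, 'w') as f:
        f.write(device)
    return device

def release_lock(sync_dir, device):
    """Record this device as last-used, then clear the lock."""
    with open(os.path.join(sync_dir, 'last-used'), 'w') as f:
        f.write(device)
    open(os.path.join(sync_dir, 'lock'), 'w').close()
```

Between acquire and release you'd compare the last-used file to the local device name, sync if they differ, then launch hydrus_client.py.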
>>23495 for larger collections you can sync the client_files folder to the same directory on both clients, and put the DB and program files on the thumbdrive. Hydrus has enough controls to handle all the failure modes there. I.e. `database->move media files...` >>23496 It's got a few edge cases as does all network-locking, but yeah, it could be a good compromise for people who must sync and are technical enough to set it up.
>>23457 I've played around with some of the open source text models and I'm of two minds. It is neat to gen your own text tokens, but the last time I checked they just hallucinated too much (or crashed mid-answer) compared to the stuff from the big guys. Haven't looked in about nine months though, so I'm sure I am out of date. My suspicion is that it is a hardware limitation for now, so my hope is that we'll see more consumer-level AI hardware and we'll get 128GB video cards or dedicated chips or whatever we'd need for truly powerful local gen. I'd be open to spending some serious dosh for a machine I could have doing jobs in the background like clever image upscaling or answering questions or writing unit tests or whatever, but only once the models get good and there's some sort of generally convenient control layer that lets us run it all. >>23465 Great, thank you! The executable build is for convenience, for people who don't want to figure out a python venv and stuff. I've thought about retiring it. As you say, since it is built on Ubuntu, the further you stray from that, the less reliable the communication layer between the frozen code and your OS's API calls. Maybe I'll retire it when we have a better ecosystem of flatpak and other one-click automated source installs, where I'll have something easy to point to as alternative. Hydrus has a flatpak now, but someone else manages it and I think they have to do some weird hacks to make it all work. An AppImage is another thing I have heard about, but I don't know much. >>23474 Yep, 100%. I was doing the same thing and I want all those tools available. Thank you for the bug report as well. I'll check it out. >>23477 Great, thank you! >>23483 Yes, very good idea. >>23484 >>23485 No good solution yet, sorry! I've still got it in mind (probably by having a master client that other clients can dial into), and the Client API is essentially ready to handle this sort of thing, but it'll take a bunch of work. 
For now, maybe check out Hydrus Web: https://github.com/floogulinc/hydrus-web https://hydrus.app/welcome
>>23492 Just as an idea, I once knew a dude who had a tablet that he would VNC to his hydrus machine and he had some app that put a shortcut button overlay over his screen, and he'd hit all these buttons to tag and archive/delete filter his stuff from the couch. I want to make this sort of thing easier if possible. >>23496 If it helps, Hydrus makes a 'client_running' file in the db folder while it is open--it uses this for 'already running' and 'last exit was not clean' checks--which should contain the PID of the process running it and I think process start time (accurate to 0.5s I think, to help anyone reading it line things up from a different clock source). I know a different guy who inspects this and does syncing stuff of some sort with it. If you are able to query the PIDs from another computer, you could probably figure out an 'already running' test that worked across multiple machines. I don't think this helps you, but you can also command the database to lock itself and disconnect from the db: https://hydrusnetwork.github.io/hydrus/developer_api.html#manage_database_lock_on This is meant for making backups while the program is running. You absolutely must not ever swap in a new database while in a locked state and then unlock the db.
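For the 'client_running' idea above, a minimal local liveness check might look like this. This is a sketch under assumptions: parsing the file itself is left out (its exact layout is whatever hydrus writes), and signal 0 only answers for PIDs on the *same* machine, so a cross-machine test would still need something like SSH or a small agent on each box.

```python
import os

def pid_is_alive(pid):
    """True if a process with this PID exists on this machine.

    os.kill with signal 0 performs the existence/permission check
    without actually sending a signal.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but owned by another user
    return True
```

You'd combine this with the start-time recorded in the file to guard against PID reuse, as hydev suggests.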
>>23487 Wasn't there an unofficial Hydrus docker container that you could VNC into? Usually I think this workflow is kinda stupid but it would work well for Hydrus.
Hydrus consistently gives that "mpv failed to load {hash} so I've unloaded it" error message on many files when I have the TEST option for stopping mpv media before switching enabled. It doesn't happen once I disable the setting. I'm running from source on Fedora 41 KDE, on X11
I noticed that Hydrus considers a chromaticity chunk in a PNG image to be an "ICC profile" for the purpose of the system predicate. AFAIK they're two separate chunks and chromaticity is obsolete, but I kinda get why hydrus would just lump them together. However, for some reason it doesn't seem to do the same for an sRGB chunk, which similarly fulfills the role of specifying color. In fact, from what I know, if an image is using the sRGB colorspace, you can just use an sRGB chunk and not need any other color profile chunks. For consistency, could you either stop counting chromaticity chunks as ICC profiles, or start counting sRGB chunks as well? I know this seems like a very specific issue, but I have a workflow that requires the presence or absence of this to be consistent, so it'd be helpful.
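The chunks in question are easy to inspect directly, since PNG is just an 8-byte signature followed by length/type/data/CRC records. A small sketch (not hydrus's own code) to list a file's chunk types, so you can see which of iCCP (a real ICC profile), cHRM (chromaticity), and sRGB a given image actually carries:

```python
import struct

PNG_SIG = b'\x89PNG\r\n\x1a\n'

def png_chunk_types(data):
    """Return the chunk type names of a PNG byte string, in order.

    Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte
    CRC. We skip over data and CRC without validating them.
    """
    if not data.startswith(PNG_SIG):
        raise ValueError('not a PNG')
    types = []
    pos = len(PNG_SIG)
    while pos + 8 <= len(data):
        length, ctype = struct.unpack('>I4s', data[pos:pos + 8])
        types.append(ctype.decode('ascii'))
        pos += 8 + length + 4  # header + data + CRC
        if ctype == b'IEND':
            break
    return types
```

Run it over `open(path, 'rb').read()` to check whether a particular file would trip whichever rule hydrus settles on.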
good morning sirs. I've got a question re. watchers. I currently have a couple of watchers that sometimes get new files. To keep things organized after I archive/delete, I clear the page. But if I switch to another watcher and back, the archived files are displayed again. It's the same whether or not there are new files. I understand that the files are associated with the watcher, but if I explicitly say to remove them from the view, I'd expect them not to reappear. Am I doing something wrong here? I don't think I touched any of the settings.
>>23517 when you remove a thumbnail, you're literally removing it from the current "view", not taking it out of the watcher. The file is still in the log, so when you switch to the watcher again and Hydrus checks which files are in it, it'll see that file and put it right back into the view. If you specifically want archived files not to be displayed, you can change this directly in "import options → file → presentation options": the second dropdown menu can be changed to "must be in inbox" so that archived files won't be displayed.
I updated my venv, and made the serious mistake of not backing up my db beforehand. instead, I just copied the venv itself in case it doesn't work so I can revert. when booting after the venv update, I got this very scary looking error v627, 2025-06-29 22:34:18: hydrus client started v627, 2025-06-29 22:34:19: booting controller… v627, 2025-06-29 22:34:19: booting db… v627, 2025-06-29 22:34:19: checking database v627, 2025-06-29 22:34:19: An object in a list could not load. It has been discarded from the list. More may also have failed to load, but to stop error spam, they will go silently. Your client may be running on code versions behind its database. Depending on the severity of this error, you may need to rollback to a previous backup. If you have no backup, you may want to kill your hydrus process now to stop the cleansed list ever being saved back to the db. v627, 2025-06-29 22:34:19: ================ Exception ================ SerialisationException: Could not initialise this object of type File Search Predicate! 
================ Traceback ================ Traceback (most recent call last): File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusSerialisable.py", line 340, in InitialiseFromSerialisableInfo self._InitialiseFromSerialisableInfo( serialisable_info ) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^ File "/home/user1/hydrus/hydrus git/hydrus/client/search/ClientSearchPredicate.py", line 638, in _InitialiseFromSerialisableInfo self._value = tuple( sorted( HydrusSerialisable.CreateFromSerialisableTuple( serialisable_or_predicates ), key = lambda p: HydrusText.HumanTextSortKey( p.ToString() ) ) ) ~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/user1/hydrus/hydrus git/hydrus/client/search/ClientSearchPredicate.py", line 638, in <lambda> self._value = tuple( sorted( HydrusSerialisable.CreateFromSerialisableTuple( serialisable_or_predicates ), key = lambda p: HydrusText.HumanTextSortKey( p.ToString() ) ) ) ~~~~~~~~~~^^ File "/home/user1/hydrus/hydrus git/hydrus/client/search/ClientSearchPredicate.py", line 1953, in ToString service = CG.client_controller.services_manager.GetService( service_key ) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ AttributeError: 'NoneType' object has no attribute 'GetService' During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusSerialisable.py", line 683, in _InitialiseFromSerialisableInfo obj = ConvertMetaSerialisableTupleToObject( meta_tuple ) File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusSerialisable.py", line 383, in ConvertMetaSerialisableTupleToObject obj = CreateFromSerialisableTuple( serialisable ) File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusSerialisable.py", line 196, in CreateFromSerialisableTuple obj.InitialiseFromSerialisableInfo( version, serialisable_info, raise_error_on_future_version 
= raise_error_on_future_version ) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusSerialisable.py", line 350, in InitialiseFromSerialisableInfo raise HydrusExceptions.SerialisationException( 'Could not initialise this object of type {}!'.format( self.SERIALISABLE_NAME ) ) hydrus.core.HydrusExceptions.SerialisationException: Could not initialise this object of type File Search Predicate! ================== Stack ================== File "/usr/lib64/python3.13/threading.py", line 1012, in _bootstrap self._bootstrap_inner() File "/usr/lib64/python3.13/threading.py", line 1041, in _bootstrap_inner self.run() File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusThreading.py", line 450, in run callable( *args, **kwargs ) File "/home/user1/hydrus/hydrus git/hydrus/client/ClientController.py", line 2482, in THREADBootEverything self.InitModel()
[Expand Post] File "/home/user1/hydrus/hydrus git/hydrus/client/ClientController.py", line 1194, in InitModel HydrusController.HydrusController.InitModel( self ) File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusController.py", line 606, in InitModel self._InitDB() File "/home/user1/hydrus/hydrus git/hydrus/client/ClientController.py", line 225, in _InitDB self.db = ClientDB.DB( self, self.db_dir, 'client' ) File "/home/user1/hydrus/hydrus git/hydrus/client/db/ClientDB.py", line 254, in __init__ super().__init__( controller, db_dir, db_name ) File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusDB.py", line 440, in __init__ self._RepairDB( version ) File "/home/user1/hydrus/hydrus git/hydrus/client/db/ClientDB.py", line 6208, in _RepairDB new_options = self.modules_serialisable.GetJSONDump( HydrusSerialisable.SERIALISABLE_TYPE_CLIENT_OPTIONS ) File "/home/user1/hydrus/hydrus git/hydrus/client/db/ClientDBSerialisable.py", line 340, in GetJSONDump obj = HydrusSerialisable.CreateFromSerialisableTuple( ( dump_type, version, serialisable_info ) ) File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusSerialisable.py", line 196, in CreateFromSerialisableTuple obj.InitialiseFromSerialisableInfo( version, serialisable_info, raise_error_on_future_version = raise_error_on_future_version ) File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusSerialisable.py", line 340, in InitialiseFromSerialisableInfo self._InitialiseFromSerialisableInfo( serialisable_info ) File "/home/user1/hydrus/hydrus git/hydrus/client/ClientOptions.py", line 855, in _InitialiseFromSerialisableInfo loaded_dictionary = HydrusSerialisable.CreateFromSerialisableTuple( serialisable_info ) File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusSerialisable.py", line 196, in CreateFromSerialisableTuple obj.InitialiseFromSerialisableInfo( version, serialisable_info, raise_error_on_future_version = raise_error_on_future_version ) File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusSerialisable.py", line 
340, in InitialiseFromSerialisableInfo self._InitialiseFromSerialisableInfo( serialisable_info ) File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusSerialisable.py", line 486, in _InitialiseFromSerialisableInfo value = ConvertMetaSerialisableTupleToObject( meta_value ) File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusSerialisable.py", line 383, in ConvertMetaSerialisableTupleToObject obj = CreateFromSerialisableTuple( serialisable ) File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusSerialisable.py", line 196, in CreateFromSerialisableTuple obj.InitialiseFromSerialisableInfo( version, serialisable_info, raise_error_on_future_version = raise_error_on_future_version ) File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusSerialisable.py", line 340, in InitialiseFromSerialisableInfo self._InitialiseFromSerialisableInfo( serialisable_info ) File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusSerialisable.py", line 486, in _InitialiseFromSerialisableInfo value = ConvertMetaSerialisableTupleToObject( meta_value ) File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusSerialisable.py", line 383, in ConvertMetaSerialisableTupleToObject obj = CreateFromSerialisableTuple( serialisable ) File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusSerialisable.py", line 196, in CreateFromSerialisableTuple obj.InitialiseFromSerialisableInfo( version, serialisable_info, raise_error_on_future_version = raise_error_on_future_version ) File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusSerialisable.py", line 340, in InitialiseFromSerialisableInfo self._InitialiseFromSerialisableInfo( serialisable_info ) File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusSerialisable.py", line 690, in _InitialiseFromSerialisableInfo HydrusData.ShowException( e ) File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusData.py", line 369, in PrintException PrintExceptionTuple( etype, value, tb, do_wait = do_wait ) File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusData.py", line 400, in 
PrintExceptionTuple stack_list = traceback.format_stack() =================== End =================== I can't post the whole thing because it's too long, but I didn't notice the part way at the top about how I should kill the Hydrus client process until about 30 seconds later. I did as soon as I noticed, but it probably got saved. I then restored the old venv, but the same error occurred. I then made a new venv with safer versions (I hoped, anyway) and tried one more time, and I didn't get the error this time. I booted Hydrus... everything looks fine. My files are all here, and I don't see anything broken. My session is even still intact. Anyway Dev, how fucked am I? I see that the error is about a search predicate, which doesn't sound very dangerous, but I don't know how Hydrus works internally, so I don't know how dangerous this might be. Also, sidenote: I never thought an error this bad could happen from a venv update. I didn't update Hydrus itself; it was already on 627. I only updated the venv. Needless to say, I'm not doing that again without a backup. Looks like even a venv update isn't safe
>>23521 well I did some searching through the logs, and it turns out that on the 10th while I was on v625, I actually got essentially the same error and never noticed. v625, 2025-06-10 12:26:01: hydrus client started v625, 2025-06-10 12:26:04: booting controller… v625, 2025-06-10 12:26:04: booting db… v625, 2025-06-10 12:26:04: checking database v625, 2025-06-10 12:26:04: An object in a list could not load. It has been discarded from the list. More may also have failed to load, but to stop error spam, they will go silently. Your client may be running on code versions behind its database. Depending on the severity of this error, you may need to rollback to a previous backup. If you have no backup, you may want to kill your hydrus process now to stop the cleansed list ever being saved back to the db. v625, 2025-06-10 12:26:04: ================ Exception ================ SerialisationException: Could not initialise this object of type File Search Predicate! ================ Traceback ================ Traceback (most recent call last): File "/home/user1/hydrus/hydrus git/hydrus/core/HydrusSerialisable.py", line 339, in InitialiseFromSerialisableInfo self._InitialiseFromSerialisableInfo( serialisable_info ) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^ File "/home/user1/hydrus/hydrus git/hydrus/client/search/ClientSearchPredicate.py", line 638, in _InitialiseFromSerialisableInfo self._value = tuple( sorted( HydrusSerialisable.CreateFromSerialisableTuple( serialisable_or_predicates ), key = lambda p: HydrusText.HumanTextSortKey( p.ToString() ) ) ) ~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/user1/hydrus/hydrus git/hydrus/client/search/ClientSearchPredicate.py", line 638, in <lambda> self._value = tuple( sorted( HydrusSerialisable.CreateFromSerialisableTuple( serialisable_or_predicates ), key = lambda p: HydrusText.HumanTextSortKey( p.ToString() 
) ) ) ~~~~~~~~~~^^ File "/home/user1/hydrus/hydrus git/hydrus/client/search/ClientSearchPredicate.py", line 1953, in ToString service = CG.client_controller.services_manager.GetService( service_key ) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ AttributeError: 'NoneType' object has no attribute 'GetService' (I cut the rest because it's basically the same as the error from the post above) I guess that means that this isn't something catastrophic, assuming it really is the same error. but I really don't like the idea of something just being broken somewhere in my db.
>>23506 Apple *spits* is kinda pushing that direction with their ARM desktops that have unified memory. If you have 256gb of shared ram-vram, then it stops mattering as much. >syncing files between devices This should be built into the foundation for kinda competing with ehentai's hentai@home software. I don't see any point in letting my phone sync with my desktop, if it also isn't applied more generally. Internet censorship, boorus restricting access due to scraping tools, etc. >>23492 Just use VNC >>23517 Are you misusing a watcher, or maybe I'm confused? I use a gallery download page to get everything from someone, then once the download is done add them as a subscription. 1 subscription downloader per site, and each downloader filled with 1 tag subs for artist names.
>>23521 >>23523 okay, I ran the DB integrity check overnight and it says there were 0 errors found. I guess that means my DB is actually fine, so hopefully I'm safe, but then I don't know what that error is about or what's causing it.
>>23525 It's saying it ran into some search objects different from what it expected and they couldn't be loaded. You probably just lost a single page in the session, no actual data issues.
>>23521 >>23523 >>23525 Hey, I'm sorry for the error and related panic here. Your database is fine, this is all me. It looks like there is a dumb bug when loading a system:rating predicate that is inside an OR predicate too early in program boot. I'm not sure if this would be a database update step or some initial part of the Main GUI load. If any of your sessions or favourite page loadouts or anything hold something like an OR with a rating pred inside, that is probably what is being hit here. It may have deleted/ignored that page or that pred from the active search predicates on that page, but everything else around it probably loaded ok. I've fixed the bug for v628. I'm doing more ratings work tomorrow so I'll give it another look just to be sure. Please let me know how v628 goes.
Could it be possible to add export patterns for modified times and such? I can just write a little script to do it myself since the exported files have the proper times but it would be nice if Hydrus could do that.
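In the meantime, a little script like the one mentioned is easy to sketch. This assumes a hypothetical layout where each exported file has a plain-text sidecar named '<filename>.txt' holding a Unix modified timestamp (the '.txt' naming and timestamp format are assumptions here, not anything hydrus produces by default):

```python
import os
from pathlib import Path

def apply_sidecar_mtimes(export_dir):
    """For each exported file, look for a hypothetical '<name>.ext.txt' sidecar
    holding a Unix timestamp and stamp it onto the file's modified time."""
    for sidecar in Path(export_dir).glob("*.txt"):
        target = sidecar.with_suffix("")  # 'image.jpg.txt' -> 'image.jpg'
        if not target.exists():
            continue
        ts = float(sidecar.read_text().strip())
        os.utime(target, (ts, ts))  # sets both atime and mtime
```

Run it once over the export folder after the export job finishes.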
I had a very good week. I cleaned a ton of code, fixed bugs, and improved some quality of life. There are several more rating UI features, too. The release should be as normal tomorrow.
>>23526 >>23527 thanks guys. it's good to hear that it's nothing serious. I'll update when v628 is released and see if that stops happening then
I don't know if it's just me, but it seems like in version 627 if you use the keyboard to switch tabs, the keyboard focus will sometimes not be put on the file thumbnails.
https://www.youtube.com/watch?v=ByMrACzUCzk
windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v628/Hydrus.Network.628.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v628/Hydrus.Network.628.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v628/Hydrus.Network.628.-.macOS.-.App.zip
linux
tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v628/Hydrus.Network.628.-.Linux.-.Executable.tar.zst

I had a great week cleaning old code and fixing/improving small things. There are also more neat ratings UI options.

This release folds in the library updates we tested last week. The test went well and there are no special install instructions, but let me know if you have any trouble.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

ratings

Thanks to a user, we have another round of ratings UI features. A new options->ratings page lets you edit the size of ratings in every location, including separate sizes for inc/dec ratings and for the 'manage ratings' dialog, and you can now choose to collapse the thumbnail numerical ratings display to just '3/5 (one-star)'. Under services->manage services, you can now add this '3/5' text to the left or right of any normal numerical ratings display. Also, thanks to another user, we have a bunch of new default ratings svgs. I think they look great.

Also, under options->media viewer, you can now choose to show the 'top-right' hover window in the preview viewer! So, you can now quick-set ratings with one click. Try this out--I like it so much that I think I'll soon make it default for new users.

duplicates quality of life

Duplicate pages now count up their searches' pairs with my new incremental duplicate fetch tech. Touching a duplicate page will no longer lock your database for a couple of seconds, and there's pause and fast cancel-reset tech in there too. It should just work better all around, but let me know if it ever under-counts.

Semi-automatic duplicate rules similarly commit large approve/deny commands in pieces, rather than blocking the database for the whole job, and they update the UI with an 'approving: 16/55' label so you can see how they are doing. There's also a 'select all' button.

next week

I've got a lot of ideas, but I would like to get the incremental duplicate fetch tech working in the main duplicate filter fetch routine and the 'show some random' button too. I'd love to reduce the fetch time when I only need a handful of pairs, and if I can get this all working on the same unified and sane pipeline, I can finally think about changing the order pairs arrive in.
>>23537 I'm glad you added that fraction-beside-ratings option, but in dark mode it's unreadable; it shows up for me as black text on a dark grey background. Is there a way to change the color of the fraction?
>>23538 I figured it out. for anyone else having this issue, it looks like the "border for liked" color is reused for the color of the text. it's a bit confusing because I'd usually want that to be a dark color, but I guess I gotta make it light if I want to see the fraction. not too big of a deal, just a bit strange.
Hell yeah, the rule34.xxx downloader is working again. The site has horrible tagging, but hosts stuff that gets struck down from other boorus. >>23537 TY dev. Quick question: how does Hydrus currently handle duplicates? If I subscribe to the same artist on derpibooru and e621, how is that handled? Or is there automatic md5 deduplication, with the duplicate filter only for duplicates that don't share an md5 hash?
I found a bug in version 628. In the archive/delete filter, the top-right media viewer hover isn't allowed to pop up, so you can't see it or click on stuff like links and ratings. It only happens in the archive/delete filter, and it happens even when the checkbox that stops the top-right from popping up is disabled. Notes do still pop up, though.
>>23540 yes. if files are hash-identical (i.e. all same bytes and size), only one copy is stored. otherwise you have to run a job to find files that are visually identical/similar (duplicates processing) and then decide what to do with them (duplicates filter)
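For reference, hydrus keys files on their SHA-256 hash, so "hash-identical" means byte-identical. You can reproduce the check yourself with the stdlib:

```python
import hashlib

def file_sha256(path):
    """Stream a file through SHA-256 in 1 MiB blocks.
    Byte-identical files always produce the same digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()
```

Two downloads of the same post from different boorus will only auto-merge if their digests match; a resized or re-encoded copy hashes differently and needs the duplicates filter.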
Maybe of interest to image-file-format enthusiasts: people are sharing images with a special embedded 'chara' metadata field. These can be loaded into many AI/LLM programs to use that personality. There are many, many image sites hosting these things. All of them are flaky, and it's a good idea to start archiving the cards you use. Hydrus can currently filter for files with 'embedded metadata', but there's no way to find all files with the 'chara' embedded text. The data is encoded too, so viewing it requires manual external action anyway. Also worth noting: this data most often includes NSFW text even if the image itself is tame. I've also seen embedded links to NSFW images, which some UIs autoload. Of course, there are also cards with legally questionable text.

specifications for the embedded data:
https://github.com/malfoyslastname/character-card-spec-v2/blob/main/spec_v1.md (including a description of the embedding method)
https://github.com/malfoyslastname/character-card-spec-v2/blob/main/spec_v2.md

4 example images, hopefully
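For anyone wanting to script this: per the spec linked above, the card JSON lives base64-encoded in a PNG tEXt chunk with the keyword 'chara'. A stdlib-only sketch (no Pillow needed) that walks the chunk list and pulls it out:

```python
import base64
import json
import struct

def read_chara(path):
    """Scan a PNG's tEXt chunks for the 'chara' keyword and decode it.
    Per the character-card spec, the value is base64-encoded JSON.
    Returns the parsed dict, or None if no 'chara' chunk is present."""
    with open(path, "rb") as f:
        data = f.read()
    if data[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError("not a PNG")
    pos = 8
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            keyword, _, value = body.partition(b"\x00")
            if keyword == b"chara":
                return json.loads(base64.b64decode(value))
        pos += 8 + length + 4  # advance past chunk body and 4-byte CRC
    return None
```

Handy for batch-checking an archive folder for which files are actually cards.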
>>23542 TY >>23543 Interesting. ComfyUI and other AI generation programs also embed AI workflow data into .png files. I feel like this is a bit out of scope for hydrus directly, though. It's built for booru scraping & management. I wanted to use it for doujins or movies, but it isn't built for that. Ideally, if you wanted to use hydrus for AI template management, it would need a specific downloader for that site, and you would run it in a separate instance IMO.
>>23544 >out of scope >It's built for booru scarping >The hydrus network client is a file-management application written for internet-fluent media nerds who have large file collections. Sorry for sharing the downloader I made I guess. I didn't know I was using hydrus in a way some anon didn't like.
>>23543 we've been over this last thread, including Stable Diffusion and ComfyUI embeddings.
(66.64 KB 626x437 03-20:07:40.png)

Hydev, would it be possible to run a parser/downloader import from within hydrus? Like I have this pic anon posted just now and it would be pretty neat to be able to actually import it from a right click menu or something.
>>23543 >>23545 As someone who has recently started collecting character cards from Chub and JannyAI, I find this extremely useful. I've been adding mine to hydrus with local tags by hand. Making downloaders is beyond my knowledge, so I for one am glad you shared it. This will save me so much time.
>>23551 Oh, turns out you can drag and drop them onto lain and it works. I thought that the importer menu was one of the menus that blocks all other inputs.
>>23545 Sorry if I wasn't clear. It's cool, but I'd only run the AI card downloader in a dedicated instance, because it would be too easy to get it mixed up in random booru images. Also it wasn't clear you made a downloader, I thought you just posted the metadata spec. Sorry if I'm retarded!
Is there a way to blacklist combinations of two or more tags? For example, I want to blacklist files that contain both 'creator:1' AND 'badtag' together, but let files with only 'creator:1' OR 'badtag' through. Thank you!
>>23508 Yeah, on the main github repo here: https://github.com/hydrusnetwork/hydrus/pkgs/container/hydrus It builds every week with the other builds. I don't know much about Docker though. There's a guy who uses it for loads of stuff who figured out the Dockerfile build scripts here: https://github.com/hydrusnetwork/hydrus/tree/master/static/build_files/docker >>23513 Yes, thank you. Something is messed up when re-loading a fresh mpv window from stasis in that mode. I'll give it another go. >>23514 Are you completely sure the PNG here doesn't have an ICC Profile too? I did a bunch of work to get gamma&chromaticity pngs to render with correct colours about a month ago, and while a lot of this code is related, I don't think the actual file test overlaps. My 'has icc profile' flag is calculated basically by loading the png with Pillow and then seeing if Pillow provides an icc profile in the 'info' of the loaded object. If a png only has the gamma&chromaticity stuff, it doesn't have the 'icc_profile' info. If a png has ICC Profile bytes, I apply it if the user is set to currently apply icc profiles. If a png doesn't have an icc profile, or if the user has set not to apply icc profiles, I then see if it has an sRGB chunk, in which case I do nothing since hydrus is sRGB, and then I see if it has gamma&chromaticity, at which point I apply that. I'm thinking of adding flags like 'has sRGB chunk' and 'has gamma & chromaticity chunks', so we can search them, but I don't think they overlap with the old ICC Profile stuff. Maybe they overlapped in the past? What happens if you find some of these pngs and do manage->maintenance->determine if file has an icc profile? Do they fix themselves? In any case, if you have weird files, I'm interested in seeing them. If you can point me to some, or post a catbox zip or email me or whatever, I can check them on my end. That said, much of this code is a mess with old debug hooks in it and I'd like to rework it. 
If a user has 'apply icc profiles' off, then any thumbnails are generated with wrong colours, which I don't want. I want all those options to happen higher up, closer to the user flicking back and forth different render modes. I've also never been completely fluent in how these systems are 'supposed' to work, either, so feedback from users like you is great. >>23530 You have to do it with sidecars at the moment. Add a sidecar to your file export and then select 'time' as the source for the sidecar. It is awkward like the rest of sidecars, but you can do file or domain modified time. I do want to completely overhaul export filename phrases to handle stuff like times (and more gracefully handling non-present data). If I screwed my head on straight, it'd probably inherit most of the sidecar toolkit.
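On the PNG colour question a few posts up: the decision order described (ICC profile, else sRGB chunk, else gamma & chromaticity) can also be checked without Pillow by scanning the chunk list directly. A rough stdlib sketch, useful for triaging weird files before sending them in:

```python
import struct

def png_colour_chunks(path):
    """Return the set of colour-management chunk types present in a PNG:
    any of b'iCCP', b'sRGB', b'gAMA', b'cHRM'."""
    wanted = {b"iCCP", b"sRGB", b"gAMA", b"cHRM"}
    found = set()
    with open(path, "rb") as f:
        data = f.read()
    if data[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError("not a PNG")
    pos = 8
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        if ctype in wanted:
            found.add(ctype)
        if ctype == b"IDAT":  # colour chunks must precede image data
            break
        pos += 8 + length + 4  # skip body and CRC
    return found
```

A file that shows `{b'gAMA', b'cHRM'}` but no `b'iCCP'` is exactly the case where the 'has icc profile' flag would be false while gamma correction still applies.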
>>23534 I don't think I changed anything. There's a setting under options->gui pages->navigation drag and drop that lets you always move the focus to the primary 'text input' (like the autocomplete dropdown) of a page on page change, but I don't think I've ever had it for the media side of things. Afaik Qt just remembers whatever the previous focus was, although if you are using Ctrl+Tab there might be situations where the focus moves to the page tab itself. I added some shortcut actions to the 'the main window' shortcut set for 'move page selection left/right', that you might like to try mapping to something else, if there's Qt focus remapping going on in your OS. I think Ctrl+M by default moves the focus to the thumbnail grid, if that's helpful. You can remap that too, also under 'the main window'. I often do Ctrl+M->Ctrl+A for stuff. >>23538 >>23539 Thanks, I'll look into it. It should render the same as other text there, I bet some pen colour isn't being reset amidst the drawing routine. >>23540 Here's some formal help on this: https://hydrusnetwork.github.io/hydrus/duplicates.html And this is the thing I've been working on a lot this year. It isn't ready for normal users yet, but we are getting there: https://hydrusnetwork.github.io/hydrus/advanced_duplicates_auto_resolution.html That first help page is pretty old and patchwork now. Since you are new to the system, please let me know if anything is confusing or if any of the screenshots are so crazy old they don't make sense etc... >>23541 Yep, sorry--this is driving me nuts too. Something went wrong with the preview viewer top-right hover integration. I will absolutely fix this for next week. >>23543 Thank you, very useful information. I've got these images now. 
I plan to expand the 'human-readable embedded metadata' recognition to have a db cache, so you'll eventually be able to search for common stuff (and the same system will probably let you search for EXIF tags too), so these are great to work with so I know what I'm aiming at.
>>23553 Haha, didn't know this worked! The copy/paste bitmap is also a good quick solution. I've thought about recognising these better. I may, one day, since there's a bunch of other gubbins you can export to pngs. My whole serialisation system works on it, and most multi-column lists with import/export/duplicate buttons should support a similar drag and drop import. It might be nice to have a generalised 'hey this thing contains a [tag filter]--want to import to [tag filter favourites]?' right-click menu or import panel. >>23556 No, sorry! The tag filter code needs an overhaul to improve performance and simplify some things. Once it is nicer, I may be able to add logical algebra like this, and wildcards are another highly requested feature, but I will be careful to say that this stuff is often a lot more tricky than you think, so I will not promise it. For now you pretty much have to have a separate favourite tag filter that applies to artists x y z and then sequester those guys in their own subscription with separate specific tag import options. Or, if we are talking only five files, set up a complicated file search that covers these bad files and then just ctrl+a->delete that search every month. Another planned overhaul is to separate the blacklist from the tag import options in every importer, which should allow for simpler import custom setup.
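In the meantime, anyone scripting cleanup externally (e.g. against the Client API) can do the AND-combination check themselves; the logic itself is trivial. A sketch, where the rule list is hypothetical:

```python
def blocked(file_tags, and_groups):
    """True only if EVERY tag in some group is present on the file,
    i.e. 'creator:1' AND 'badtag' blocks, but either tag alone passes."""
    tags = set(file_tags)
    return any(tags.issuperset(group) for group in and_groups)

# hypothetical rule list: each set is one AND-combination to block
RULES = [{"creator:1", "badtag"}]
```

You would fetch each file's tags over the Client API, run `blocked`, and collect the hashes to delete, which is effectively the "complicated file search + ctrl+a->delete" workflow automated.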
>>23541 This is now fixed in master. If you run from source, just git pull and you are fixed. Otherwise, it will be rolled into v629. Sorry for the trouble!
The derpibooru/ponerpics/twibooru parsers should probably replace "character:tank" → "character:tank (mlp)" "character:tofu" → "character:tofu (mlp)"
>>23561 >"character:tofu" → "character:tofu (mlp)" That's a rare one though. Tank is not.
>>23561 I use "character:(tortoise) tank" plus the MLP tag "show:(mlp) - my little pony"
>>23566 Oh, right. I thought I ought to be missing something.
In v626, tag banners (at the top of the thumbnail) show both the ideal sibling and the original tag instead of just the ideal sibling. Is there a setting for that?
>>23560 yeah I do run from source and I pulled. Thanks for the fix!
>>23558 >although if you are using Ctrl+Tab there might be situations where the focus moves to the page tab itself. I added some shortcut actions to the 'the main window' shortcut set for 'move page selection left/right', that you might like to try mapping to something else, if there's Qt focus remapping going on in your OS. So I tried to do this, and I ran into a bug that's blocking me. For some reason, Hydrus refuses to recognize any shortcut that uses "Shift+Tab". Instead, it always just sees it as "Shift+Nothing". I checked with other software and it works fine there, so this isn't a problem with my keyboard. Hydrus also recognizes "Shift+Tab" when I'm using the default "Ctrl+Shift+Tab" to move left one page. (which btw, I can't find those default page-change shortcuts at all in the menu. I looked everywhere. I don't see them.) I don't know why Hydrus won't see the tab being pressed specifically when shift is pressed when I try to make a shortcut, but I can't change it due to that. The "focus to thumbnail grid" is helpful though, thanks! I didn't know about this shortcut.
>>23351 >>23352 turns out I was probably 15 or so versions out of date, which was causing the issue...
I had a great week mostly fixing bugs, particularly issues with the top-right ratings hover, export filenames, date parsing in the builds, and duplicates auto-resolution work timing. There's also improved AVIF support. The release should be as normal tomorrow.

