/hydrus/ - Hydrus Network

Archive for bug reports, feature requests, and other discussion for the hydrus network.

8chan.moe is a hobby project with no affiliation whatsoever to the administration of any other "8chan" site, past or present.

(20.80 KB 480x360 SvsHVu3xt6A.jpg)

Version 423 Anonymous 12/23/2020 (Wed) 23:40:54 Id: 4a7246 No. 15033
https://www.youtube.com/watch?v=SvsHVu3xt6A

windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v423/Hydrus.Network.423.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v423/Hydrus.Network.423.-.Linux.-.Executable.tar.gz
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v423/Hydrus.Network.423.-.macOS.-.App.dmg
linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v423/Hydrus.Network.423.-.Linux.-.Executable.tar.gz

𝕸𝖊𝖗𝖗𝖞 𝕮𝖍𝖗𝖎𝖘𝖙𝖒𝖆𝖘!

I had a good week making some small fixes and improvements to finish up the year. This is the last release of the year. There is a large poll on what 'big thing' to work on next.

poll

Here is the poll on what large work to go for next: https://www.survey-maker.com/poll3310902xA574481e-102

You can vote on multiple items. Please don't worry about the seriousness of it too much (I have a good idea of what is likely to win already, and if there are obviously jank votes, I'll reserve the right to discount a result), but I am particularly interested to know what is and is not popular further down the list. I'll take the results on the 6th of January.

The end of 2020 has come quick for me. I still have the network updates 'big job' to do, so that's first for 2021 Q1, but after that, I will plan out and hack away at the top item(s) on the poll.

If you were not aware, a team of users is doing great work managing the Github issue tracker for hydrus. There is also a process there for bumping issues with reaction votes, viewable for big jobs like so: https://github.com/hydrusnetwork/hydrus/issues?q=is%3Aissue+is%3Aopen+sort%3Areactions-%2B1-desc+milestone%3A%22Major+jobs%22

I regret that I am doing pretty terribly at consuming my Github work queue, but we'll see how 2021 goes. I'd like to start relying on that more to better prioritise my work.

work is all misc this week

The settings for autocomplete auto-search and character threshold have been moved from options->speed and memory to tags->manage tag display and search. They are also now service-specific! So, you can now set 'all known tags' or 'PTR' to autocomplete after, say, 5 characters but your 'my tags' to always get all results.

I worked on the recently re-activated '*' and 'namespace:*' advanced searches, which were running slow in the new pipeline on larger clients. I have improved some situations, and I think I have reduced the worst case scenario, but some large clients will still have trouble. I am not happy here, nor with other namespace- and subtag- lookup speed across the program, so I have made a plan to make some clever new indices here. As I have simplified and cleaned much of the tag logic over the past year, I now see the next holes that I should fill.

The new siblings and parents tag right-click menu proved a little tall after all, so I have reworked it to group items by all the services that share them. It is denser to look at, but I think it'll highlight unusual exceptions when you are trying to fix application. The 'this is the ideal' line is also removed.

You can now sort a row of pages by name from their right-click menu.

Advanced users: The parser edit UI's test panel now shows more data, and it formats JSON prettily. It isn't anything close to what a browser's developer mode will give you of course, but it is nicer. That panel also detects if you hit a jpeg or something and says so, rather than dumping garbage to the test panel or just throwing an error.

full list

- tag autocomplete searches:
- the 'fetch results as you type' and 'do-not-autocomplete character threshold' options are moved from _options->speed and memory_ to _tags->manage tag display and search_. they are now service specific!
- getting the raw '*' autocomplete is now hugely faster when both file and tag domains are specific (i.e. not 'all known xxx')
- getting the raw '*' autocomplete is now hugely faster in 'all known tags' domain. this is likely still bonkers on any decent sized client that syncs with the PTR, but if you have a small client that once synced with the PTR, this is now snappy
- the cancelability of 'namespace:*' and 'match namespaces from normal search' searches should be improved
- 'namespace:*' queries are now much faster in some situations, particularly when searching in a specific tag domain (typically this happens in manage tags dialog) or a small-file client, but are still pretty slow for clients with many files, and I think some scenarios are still bananas. I am not happy here and have started a plan to improve my service domain caches to deal with several ongoing problems with slow namespace and subtag lookup in different situations
- fixed an issue with advanced autocomplete result matching where a previously cached 'character:sam' result could match 'char:sam' search text
- some misc code cleanup and UI label improvements in autocomplete
- .
- the rest:
- the siblings & parents tag menu, which proved a little tall after all, is now compressed to group siblings, parents, and children by the shared services that hold them. it takes less space, and odd exceptions should be easy to spot
- this menu also no longer has the 'this is the ideal tag' line
- added 'sort pages by name a-z/z-a' to page right-click menu and tucked the sorts into a submenu
- the parsing test panel now shows up to 64KB of what you pulled (previously 1KB)
- the parsing test panel now shows json in prettier indented form
- when the parsing test panel is told to fetch a URL that is neither HTML nor JSON, this is now caught more safely and tested against permitted file types. if it was really a jpeg, it will now say 'looks like a jpeg' and disable parse testing. if the data type could not be figured out, it tries to throw the mess into view and permits parse testing, in case this is some weird javascript or something that you'll want to pre-parse convert
- the dreaded null-character is now eliminated in all cases when text is decoded from a site, even if the site has invalid unicode or no encoding can be found (e.g. if it is truly a jpeg or something and we just end up wanting to throw a preview of that mess into UI)
- the 'enter example path here' input on import folders' filename tagging options edit panel now uses placeholder text and auto-removes 'file:///' URL prefixes (e.g. if your paste happens to add them)
- the 'fix invalid tags' routine now updates the tag row in the local tags cache, so users who found some broken tags were not updating should now be sorted
- added --db_cache_size launch parameter, and added some text to the launch_parameters help about it. by default, hydrus permits 200MB per file, which means a megaclient under persistent heavy load might want 800MB. users with megamemory but slow drives might want to play with this, let me know what you find
- updated to cloudscraper 1.2.50

next week

I am now on vacation. I have some family Christmas things going on, and otherwise I am looking forward to grinding away at lategame X3 TC with some Wagner. I hope you can have a good break as well. I'll be back to normal in the new year, and 424 should be out on the 6th of January.
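For anyone curious how the parsing test panel behaviour in the changelog might work, here is a minimal Python sketch. It is not hydrus's actual code: the function names, the signature table, and the UTF-8 fallback are all assumptions made for illustration. It sniffs a few well-known file signatures, pretty-prints JSON, and otherwise shows a null-free text preview capped at 64KB.

```python
import json

MAX_PREVIEW_BYTES = 64 * 1024  # the changelog mentions a 64KB preview limit

# A few well-known file signatures (magic bytes). Hypothetical, non-exhaustive table.
FILE_SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}


def sniff_file_type(data):
    """Return a file type name if the data starts with a known signature, else None."""
    for signature, name in FILE_SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None


def decode_without_nulls(data):
    """Decode as UTF-8 (assumed encoding), replacing invalid bytes and dropping null characters."""
    return data.decode("utf-8", errors="replace").replace("\x00", "")


def preview_fetched_data(data):
    """Return (preview_text, can_parse) for whatever bytes a test fetch pulled."""
    file_type = sniff_file_type(data)
    if file_type is not None:
        # Really a file, not a document: say so and disable parse testing.
        return ("looks like a " + file_type, False)
    try:
        parsed = json.loads(decode_without_nulls(data))
    except ValueError:
        # Not JSON: show a capped preview of the raw text and still permit parsing.
        return (decode_without_nulls(data[:MAX_PREVIEW_BYTES]), True)
    # JSON: show it indented, capped to the same number of characters.
    return (json.dumps(parsed, indent=4)[:MAX_PREVIEW_BYTES], True)
```

Calling `preview_fetched_data` on the raw bytes of a response gives either a 'looks like a jpeg' style message with parsing disabled, or readable text with parsing allowed.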
One thing I believe will be popular soon is lossless recompression via JPEG XL. It works really well: build it via docker, or use the (right now) slightly less perfect lossless mode on squoosh.app. Is support for mapping hashes of jpeg/png/gif->jxl part of the file alternates in the vote, or are those only for grouping visual alternate versions of files?
>The settings for autocomplete auto-search and character threshold have been moved from options->speed and memory to tags->manage tag display and search.
>tags->manage tag display and search.
Where is this option again? I don't see it.
>>15043 I think that might fall under "Add ui for waifu2x and other file converters/processors", but I suppose that only mentions a UI and not managing the files created through converters. I think that adding support for lossless conversions would be a good idea, whether that be JPEG XL or webp or whatever else.
>>15045 >>15043 Yes, I think eventually the alternates will support file conversion mappings, although technically I think these may better fall into a category of duplicate. Ultimately, as >>15043 says, hydrus will have an internal workflow to auto-map conversions as dupes or whatever and do a nice metadata merge, but until then, this will have to be a Client API thing and set up outside, with the same scripts doing the conversions. As for JPEG XL itself, I am all for supporting new formats. As soon as Pillow or OpenCV (or another really nice, easy-to-get public image library) supports it, I can add support.
>>15046 Oh, to expand, for the first version of file alternates, most of the work will be back-end prep and UI to edit things, but I want two basic types of 'alternate' relationship:
- 'I am a xxx of file y', where xxx could be 'recolour', 'recostume', 'edit', or any other arbitrary label. You could have 'upscale' or 'lossless compression' or any other label you could think of.
- 'I come after this file' -or- 'I am in a file group, index n', so we can experiment with handling and sorting shorter paged comics without having to use 'page' tags.
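The two relationship types described in this post could be modelled roughly as below. This is only a sketch of the idea in Python, not anything from hydrus itself; the class names, field names, and helper functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelledAlternate:
    """'I am a <label> of file y', e.g. label='recolour' or 'upscale'."""
    file_hash: str
    label: str
    of_file_hash: str


@dataclass(frozen=True)
class GroupMember:
    """'I am in a file group, index n', e.g. a page in a short comic."""
    file_hash: str
    group_id: int
    index: int


def alternates_of(relations, file_hash):
    """Collect the alternates that point at file_hash, keyed by their label."""
    result = {}
    for relation in relations:
        if relation.of_file_hash == file_hash:
            result.setdefault(relation.label, []).append(relation.file_hash)
    return result


def files_in_group_order(members, group_id):
    """Return the hashes of one group sorted by index, with no 'page' tags needed."""
    in_group = [m for m in members if m.group_id == group_id]
    return [m.file_hash for m in sorted(in_group, key=lambda m: m.index)]
```

With arbitrary string labels, 'lossless compression' is just another label, which matches the point that conversions could live in the same system until a dedicated workflow exists.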
>>15044 In more recent versions, there is a 'tags' menu off the main gui window's top menubar. I have been moving tag stuff, particularly when there are service-specific options, there.
I didn't see an option for this and I'm not really sure how complicated it would be, but I'd really like a tag-based importer for exhentai (ideally that would interface with their archiver to download full res copies). They're a huge repository of stuff and we already had a scare with them this year.
(118.05 KB 1343x712 45475.jpg)

>>15048 Now I feel like an idiot, not sure how I missed that. As for the option itself, it's really confusing. I mean, I think I get it, but I'm not really sure. Basically, I used to have mine set to auto-search at 2 characters (my spelling sucks) before version 423, and now I can't even get it to do that. I set the threshold to 1 and I get nothing, and I tried the check box next to it and I still get nothing, so I'm not sure how this works anymore.
(111.35 KB 1098x718 47474.jpg)

>>15051 >>15048 OK, playing around with it some more, it works fine on the main search page, where you have the system searches along with the results and the favorites tabs, but on the tag management page, where you right-click on an image and go manage → tags, the autocomplete doesn't work as well.
What will multiple local file services add that we can't already do with multiple databases or just tags?
>>15045
>I think that adding support for lossless conversions would be a good idea, whether that be JPEG XL or webp or whatever else.
Yes. Although you might see why JPEG XL is particularly noteworthy (note the tabs for the other sheets at the bottom): https://docs.google.com/spreadsheets/d/1ju4q1WkaXT7WoxZINmQpf4ElgMD2VMlqeDN2DuZ6yJ8/edit
>>15046
> Yes, I think eventually the alternates will support file conversion mappings, although technically I think these may better fall into a category of duplicate.
Could be. How will that look on the PTR if some users (/some *boorus or CDN or whoever) re-compress everything?
> As soon as Pillow or OpenCV (or another really nice easy to get public image library) support them, I can add support.
Very nice, thanks! Let's see how quickly that happens.
>>15049 This is more than a year old, and I have not tried it myself, so I cannot say how good it is, but there is an e/exhentai downloader here to try: https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/blob/master/Downloaders/e-hentai.org%20and%20exhentai.org%20-%20version%202019-08-15.png I am not an exhentai user myself, and due to the pain in the ass login (basically only solved for us with Hydrus Companion), I have shied away from including a default downloader for all users for the time being. With the next version of the downloader engine, if there is a neater solution or just better 'here's how to make this work' UI, I'll likely fold it in, as it is a popular request.
>>15051 >>15052 Thanks. Yeah, I regret this isn't more user-friendly. I have been increasing the technical side of things here, with more service-specific and service-cross-referenced changes in behaviour, but not much text to say what is going on. The tabs in manage tag display and search refer to the different 'tag domains' an autocomplete can be in. The two buttons below the tag text box are for the file/tag domains. You can now customise different search options for 'all known tags' vs 'my tags', if one of those is very large and slow vs a small fast search. Most normal search pages are 'all known tags', whereas manage tags dialog tends to be focused on the actual service you are typing in (although again that dialog lets you override that if you have clever needs). I'll make a note to describe this better in the help and the dialog UI.
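To make the per-domain behaviour a little more concrete, here is a rough Python sketch of how service-specific autocomplete options could be looked up. It is an illustration only: the class, the dictionary, the example threshold of 5, and the exact meaning of the threshold (search once the typed text reaches that many characters) are assumptions, not hydrus's real implementation.

```python
from dataclasses import dataclass


@dataclass
class AutocompleteOptions:
    fetch_results_as_you_type: bool = True
    char_threshold: int = 0  # assumed meaning: 0 = always search, n = wait for n characters


# Hypothetical per-tag-domain settings, like the tabs in 'manage tag display and search'.
OPTIONS_BY_TAG_DOMAIN = {
    "all known tags": AutocompleteOptions(True, 5),  # large and slow: wait for 5 characters
    "PTR": AutocompleteOptions(True, 5),
    "my tags": AutocompleteOptions(True, 0),  # small and fast: always fetch
}

DEFAULT_OPTIONS = AutocompleteOptions()


def should_autocomplete(tag_domain, search_text):
    """Decide whether typing search_text in this tag domain triggers a search."""
    options = OPTIONS_BY_TAG_DOMAIN.get(tag_domain, DEFAULT_OPTIONS)
    if not options.fetch_results_as_you_type:
        return False
    return len(search_text) >= options.char_threshold
```

Under these assumptions, the same typed text can search immediately in a manage tags dialog on 'my tags' but do nothing yet on a normal search page in 'all known tags', which may explain why the two places feel different.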
>>15055 The nice thing about it vs tags is it will be a cleaner compartmentalisation, particularly for the nsfw/sfw/'my sister is looking over my shoulder' problem. When a file service only stores sfw files, then, as an example, you can type whatever into a tag autocomplete and you won't get undesired results. This would be true for many other situations where recommendations and 'similar to' and random 'system:limit' searches may provide surprises or just a wave of false positive results on your little art reference collection. As for vs multiple clients, this allows you to sync with the PTR (or have shared tags of any sort) on only one client, which saves a ton of space and CPU. All other stuff like backup routines and bandwidth options etc. are shared too, and you can have files on multiple domains as well without needing more space. I'll be interested to see what new workflows emerge as well. Maybe in a couple of years we'll be putting all 'inbox' files on their own file service or something.
>>15056 That google doc is really interesting. Seems like JPEGXL with the right parameters is great across the board. The issue of dupes and their metadata propagating to the PTR or other shared resources is a worsening and unavoidable problem. It has only worsened as the big CDNs have started optimising content. I am not sure if there really is a nice solution, and it may just be an architectural problem with the 'tag repository' idea. Perhaps a future version of the repo will also include hash-dupe pairs, but I suspect that would produce its own set of management problems, as it would require user agreement on dupes, and not all dupe merges want all tags copied. My secret dream is that the huge 'corpus' of tag metadata we have built up with the PTR so far lets us train ML in the next five years and the scope of the PTR is greatly reduced. Rather than sharing 'file x has tag y', I'd love to share 'tag x looks like y visual data', and then clients will run ML on their own files for common/simple tags. Whether the tech can ultimately do this to a high enough confidence to apply a certain subset of tags automatically isn't obvious yet, but the feedback from hydrus-dd, which uses the DeepDanbooru ML, is promising.
I had a good work week. I did a variety of small fixes and quality of life improvements, and I finished redesigning the part of the database that does wildcard tag searches. Autocomplete lookups and file searches that rely on complicated tags are running faster across the board, and the old design's sudden lag spikes (e.g. with namespace:*anything* searches) are entirely eliminated. The release should be as normal tomorrow. There will be some database work on update. I will have a better idea tomorrow, but I estimate users who sync to the PTR with an SSD can expect it to take 5-15 minutes.
>>15059 I did already try that, however it seems to only accept individual galleries (and at 'browsing' quality, at that). If I had to seek out each gallery and check for new ones myself, it's not really much of an upgrade. I expect getting something downloading via the archiver would be a pain, if for no other reason than that there are multiple pages and you would need to check that the user has enough gp/currency (people might also get upset if gp gets wasted). Not to mention the archiver download will be a compressed file that needs to be extracted. Still, in principle it should be possible. I, and I'm sure other people, can help answer questions about exhentai. It would be a big feature IMO.
>>15062
>Seems like JPEGXL with the right parameters is great across the board.
Exactly. Matches my own tests on the current Jpeg XL reference. Exciting times. ImageMagick got jxl in 7.0.10-54
>My secret dream is that the huge 'corpus' of tag metadata we have built up with the PTR so far lets us train ML in the next five years and the scope of the PTR is greatly reduced.
Yea, that would be nice. Still might leave artist name, url and so on. Plus just imagine how many more images people might produce via AI crunching / pre-producing images. I wonder if the db of known images (plus known tags - future training data?) will be obsolete all that soon.

