/hydrus/ - Hydrus Network

Archive for bug reports, feature requests, and other discussion for the hydrus network.



Version 387 Anonymous 03/04/2020 (Wed) 23:15:55 Id: 13fa37 No. 13731
https://www.youtube.com/watch?v=YjoL7xy2uA4

windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v387/Hydrus.Network.387.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v387/Hydrus.Network.387.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v387/Hydrus.Network.387.-.macOS.-.App.dmg
linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v387/Hydrus.Network.387.-.Linux.-.Executable.tar.gz
source
tar.gz: https://github.com/hydrusnetwork/hydrus/archive/v387.tar.gz

I had a great week mostly fixing things and adding and improving small features.

all misc this week

The 'sort files by' dropdown on all pages is now a button. It launches a menu that groups the different sort types, cutting the long list down into easier to navigate groups. Mouse wheel still works on it!

Also, 'sort by framerate' is added. It just does a simple num_frames / duration calculation for now. Fps isn't surfaced in the UI atm, so I expect in the near future to add it to normal file labels and also to add a system search predicate for it.

The options->sort/collect panel finally got its overhaul. Managing namespace sorts is much more sane, support for namespaces with hyphens (like 'creator-id:') is added, and you can edit the order in which they appear in the sort by menu.

Some logic behind tag autocomplete lookup is improved this week. The way special characters like braces and parentheses are handled is better (for instance if you want to search for '[intensifies]'), and now hyphens and underscores are included in these special rules. Typing in 'blue_eyes', 'blue-eyes', 'blue eyes', or 'eyes' will match all of 'blue_eyes', 'blue-eyes', and 'blue eyes'! Although the results may still be separate without tag siblings, you should never have to worry about searching an underscore version of a tag again. It will take up to five minutes to update your client to reflect the new rules. This is a simple change to a complicated system, so let me know if it fails anywhere, particularly in namespace or wildcard searches!

Right-clicking on a page tab now shows a 'duplicate page' menu item. It simply makes a complete copy of the page (or page of pages) right next door!

Numerical ratings (the ones with multiple 'stars') can now be set by dragging the mouse. You can click on 2/5 and drag up to 4/5 if you change your mind.

The derpibooru downloader gets an update thanks to a user's submission. The 'no filter' search should work again. Also the new tvch.moe imageboard is added to the supported watchers (thankfully, it was compatible with an existing parser, so this was a quick job).

full list

- the sort-files-by dropdown is now a button that launches a nested menu. it still supports mouse wheel events. it should now be quicker to find what you want!
- added 'sort by framerate' to regular file sort. it works for file search at the db level as well, when mixed with system:limit
- under options->sort/collect, the namespace sort-by ui has finally had its makeover. it now has add/edit/delete buttons and up/down buttons for reordering how the entries will appear. it also deals with bad input better. furthermore, namespaces that have hyphens (like 'creator-id') are now supported in namespace sort (and hence collect-by dropdowns!)!
- numerical (multi-star) ratings can now be set by dragging the mouse across the line of stars
- added 'duplicate page' to the page tab right-click menu! it just makes a copy of the page or page of pages right beside it
- system:everything will now always show up in non-query-page autocomplete dropdowns (such as in the file maintenance dialog)
- wrote a maintenance routine to repopulate and correct the tag text search cache. it is possible to trigger this (though it is typically pointless) from the database->maintain menu
- updated the characters that are ignored in autocomplete tag text search rules, which help skip over unusual characters and assist word-break discovery for searching for tags like '[intensifies]'. as well as the previous brackets, braces, parentheses, quotes, and double-quotes, now slash, backslash, hyphens, and underscores(!) are ignored. searching for 'bbb' will now match a tag 'aaa-bbb', and searching for 'blue_eyes', 'blue-eyes', 'blue eyes', or 'eyes' will match all of 'blue_eyes', 'blue-eyes', and 'blue eyes'!
- to effect the above change, the client will take a few seconds to a minute to update
- the above tag text search rules now collapse contiguous unusual characters, or combinations of whitespace and characters, better
- namespace and simple wildcard search inputs no longer have the tag text search rules applied to them, meaning you can now search for these unusual characters more specifically when desired
- updated the derpibooru gallery search objects to use their api, thanks to a user's submission. this re-enables the 'no filter' mode
- added watcher support for tvch.moe, which works with an existing 4chan-style parser
- the 'add the ptr' help item now warns the user about the ptr's modern drive storage requirements (4GB download+files, 25GB db). the help files are also updated
- I believe I fixed the sometimes crazy fast media drag-move that could happen in archive/delete and duplicate filters
- fixed an old uncaught wx->qt issue with the simple downloader where editing the formulae would throw an error
- fixed a bug in the 'move highlighted thumbnail' code in the rare case where the currently focused thumbnail can not be found
- text input dialogs are now mostly wider
- refactored some ui code, cleaning up core objects and import hierarchy
- did some controller/gui refactoring, pushing on untangling things
- cleaned up a bunch of no-longer-used import statements
- misc ui code cleanup
- slight rewording of database menu
- prepped shortcuts system to ignore a window-activating click (for the media viewer filters), but can't turn it on yet as media viewer clicks are not yet fully plugged in

next week

Next week is a medium-size job week. I would like to get 'favourite searches' working, so you can save a particular page's search and then quickly load it up later wherever you like. I would like to add some default ratings services to the client as well, since they are easy for new users to miss.
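For readers who want to see the new autocomplete text search rules from the release post in concrete terms, here is a minimal Python sketch. It is not hydrus's actual code: the function names `normalise` and `text_matches_tag` are made up for illustration, and the word-boundary matching is an assumption inferred from the examples in the changelog ('bbb' matching 'aaa-bbb', 'eyes' matching 'blue_eyes').

```python
import re

# Characters the changelog lists as ignored: brackets, braces, parentheses,
# quotes, double-quotes, slash, backslash, hyphens, and underscores.
# The exact set used inside hydrus is an assumption here.
_IGNORED_RE = re.compile(r"[\[\]{}()'\"/\\_-]")


def normalise(text: str) -> str:
    """Replace ignored characters with spaces and collapse runs of whitespace."""
    spaced = _IGNORED_RE.sub(" ", text)
    return " ".join(spaced.split())


def text_matches_tag(query: str, tag: str) -> bool:
    """True if the normalised query starts at any word boundary of the normalised tag."""
    q = normalise(query)
    t = normalise(tag)
    if not q:
        return False
    return t.startswith(q) or (" " + q) in t


if __name__ == "__main__":
    for tag in ("blue_eyes", "blue-eyes", "blue eyes"):
        for query in ("blue_eyes", "blue-eyes", "blue eyes", "eyes"):
            print(query, "->", tag, text_matches_tag(query, tag))
```

Because both sides go through the same normalisation, every spelling of the query hits every spelling of the tag. As the post says, namespace and simple wildcard searches skip these rules, so a real implementation would branch before calling something like `normalise`.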
>>13731 >Numerical ratings… This reminds me of the multi-state single-pip ratings. Are those a thing yet?
ooh, the 'favorite search' feature sounds interesting! can you make it so that you can append a search to another search instead of just replacing it outright? also favorite search folders would be nice. also, any eta on when the next big job poll is gonna be?
>>13733 oh and would it be possible to save sorts with searches as well in the future?
FIX E621! They updated, it doesn't work!
>>13735 Seconding this, please. Was puzzled when my e6 subs came back with nothing all day and then I went on the site and saw the update. You now have to click an "I am over 18" button :/
>>13731 have a question, is it possible in the sort by menu to have a few top level sorts? I mean favorites/things I use a lot? personally for me my common is duration and then has audio to pull anything moving out of a parse, as moving things tend to slow shit down to a crawl, then file size, and pixels would be next. if it's possible, it could be an options menu thing with checkboxes next to them, it would just create a top level duplicate of the sort method rather than move anything around from the current sub menus. this isn't a need of any kind, but if it was possible/low effort it would be nice.
E621 has updated. They now REQUIRE user logins for content on blacklists. You NEED to add these features for any downloads to work.
i was going to link to a parser i made for the new e621 update, but it looks like someone already beat me to the punch: https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/blob/master/Download%20System/All-in-Ones/Single-Sites/e621.net%20-%20version%202020.03.06.png i haven't tested it yet, but you may want to consider putting this in the next hydrus update. (also, good lord, are 8kun's captchas always this insufferable?)
>>13739 It appears to function for me. It's downloading and not mass-ignoring everything.
>>13738 actually, i noticed that you can even remove blacklisted tags when you aren't logged in (there's a 'blacklists' link on the tab bar). you'll have to manually clear that out and send the cookies over to hydrus via hc or something else to get past those, i think.
>>13739 yes, I moved to endchan almost completely, but they constantly have issues apparently. hdev, can you make 2 links for endchan, one .net and one .org, apparently they fuck one of those up but the other works.
>>13732 Not yet. I haven't really touched ratings in a while. I don't like how they work behind the scenes, another thing to queue up for a rewrite. >>13733 >>13734 Appending a search is an interesting idea. I will start with a simple replace as that is easier. Please give that a go for a bit and let me know if there are still workflows where you think appending would be useful. Would 'folders' be for nesting favourite searches into categories? Now you mention it, I think this also makes sense, rather than having just a giant list. Saving sort is also a good idea, thank you. Part of this 'favourite search' stuff will be gathering all the info on a search page into one place that I can better serialise. I'll see if I can roll sort (and collect) into that object this week, but if not, I will do the simple solution and iterate on it in future. I was going to do the next big job poll in mid-February, but I burned out a few weeks before and I realised I am just overloaded with still catching up on things after the Qt switch. I have scaled back my schedule and am going to clear out small jobs as I have been recently for the indefinite future, including more code cleanup and fixing, until things are less insane on this end and I am properly ready to start something big and new.
>>13735 >>13736 >>13738 >>13739 >>13740 >>13741 Thank you for letting me know about this. This is being worked on by lads in the discord, I believe also the login/blacklist stuff. I hope to roll out comprehensive fixed parsers for 388. I recommend you pause your e621 subs (and hit 'retry ignored' on them so you can get what you missed once it is working again) for the time being.
>>13743 Thanks, I will update the links. I think they said on twitter their .net domain was playing up, dunno if it will come back.
Neither of the new parsers is working for me. The parsers aren't finding a file to download, but they seem to be reading tags fine.
>>13760 i think this is because someone made the 'original file' bit of the parser check for a _blank class in the <a> tag of the image file, but it seems e6 has since removed it, so if you removed that bit from the parser then it should work just fine
>>13763 Thank you. Removing the _blank did indeed fix it.
>>13749 It's back, but having the net and org will be helpful because apparently when shit hits the fan, they lock down their known/popular domains, letting people in through backdoors/unpopular domains.
>>13764 Has anyone had a go at getting Hydrus to use the e6 JSON API?
I had a great week. The client now has the ability to save and load favourite searches, so if you have a frequent 'inbox videos bigger than 20MB, sorted by duration' search, you can now save this and load it up on any page. I also fixed some bugs, and there should be a fixed e621 downloader. I am afraid I ran into some IRL stuff at the end of the week, so I am suddenly a bit busy. I am behind on my messages, and the release will be late tomorrow.
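As a rough illustration of what "gathering all the info on a search page into one place" and serialising it could look like, here is a small Python sketch. Everything in it is hypothetical: the `FavouriteSearch` class, its field names, and the predicate strings are invented for this example and are not hydrus's real objects or exact predicate syntax.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class FavouriteSearch:
    """Hypothetical container for a saved search: predicates plus optional sort/collect."""

    name: str
    predicates: List[str] = field(default_factory=list)
    sort_by: Optional[str] = None
    collect_by: Optional[str] = None

    def to_json(self) -> str:
        # Serialise to a JSON string so it can be stored and reloaded on any page.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, payload: str) -> "FavouriteSearch":
        return cls(**json.loads(payload))


if __name__ == "__main__":
    fav = FavouriteSearch(
        name="big inbox videos",
        # Illustrative predicate strings only, not exact hydrus syntax.
        predicates=["system:inbox", "system:filetype is video", "system:filesize > 20MB"],
        sort_by="duration",
    )
    restored = FavouriteSearch.from_json(fav.to_json())
    print(restored)
```

Keeping sort and collect as optional fields on the same object means a simple "replace the page's search" load can ignore them at first and pick them up later without changing the stored format.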
replying to older post, >13671 >I am afraid I do not know anything about Rust, so I do not feel I can talk very confidently about this. I do not want to ask you to do some clever work that I then do not know how to integrate. Can you talk more about it? Rust is a zero-cost compiled language like C, but has strong safety guarantees that let you multithread things easily. In a lot of cases you can just swap out your typical iterator loop for a parallelized version and you're done. This makes it really really nice for modules in interpreted languages that you can call and get a nice performance boost. Because of its safety most bugs in rust programs are logic errors not undefined behavior or race conditions. Python has a major issue with its locking system (global interpreter lock) that has gone unfixed or is unsolvable, which makes it poor at multithreading. Things like hashing/phashing/anything cpu intensive can be greatly sped up. For integration you would just call cargo build --release inside the library folder, and then copy the compiled files from target/release into your lib folder, and then import using python. It supports compilation for all the platforms you support. >I do not want to add more phashes to the database at the moment (the bottleneck in the dupes system atm is processing potential dupes rather than needing to find even more) Can you describe the bottleneck? Is it the string distances? I can parallelize that. >I would be interested in a simple python file that could take an image in PIL or OpenCV format that would then return the cropped SmartCrop version, or the respective coordinates. This is somewhat problematic because I need the raw bytes or raw file access to get those bytes, otherwise I would have to reconstruct python's format into rust's image format which is obviously going to suck balls. I'll look into it.
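A side note on the GIL point, as a hedged sketch rather than a benchmark: for plain cryptographic hashing specifically, the standard library's `hashlib` documents that it releases the GIL while hashing data larger than 2047 bytes, so a thread pool can already hash several blobs at once in pure Python. The helper names below are made up for illustration, and this says nothing about perceptual hashing or other work done in Python-level loops, which stays GIL-bound.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List


def sha256_hex(data: bytes) -> str:
    """Hash one in-memory blob; hashlib drops the GIL for inputs over 2047 bytes."""
    return hashlib.sha256(data).hexdigest()


def hash_all(blobs: Iterable[bytes], workers: int = 4) -> List[str]:
    """Hash many blobs on a thread pool, returning digests in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(sha256_hex, blobs))


if __name__ == "__main__":
    # Synthetic data standing in for file contents.
    blobs = [bytes([i]) * 1_000_000 for i in range(8)]
    for digest in hash_all(blobs):
        print(digest)
```

Whether this beats a compiled extension depends on the workload. The point is only that the GIL limits Python bytecode, not C code that explicitly releases it.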
>>13771 >I'll look into it. I've looked into it, this actually may be pretty easy. At some point, the smartcrop rust library takes just a vector of RGB pixels and their dimensions. I can just toss out the stuff it uses before that, then I would just need the raw pixels from opencv/pil. I'll give it a shot, I may be able to gpu accelerate this with opencl as well but I'm not very hopeful about it.
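To make "a vector of RGB pixels and their dimensions" concrete, here is a small Python sketch of one common layout: a flat, row-major buffer with three bytes per pixel. That the smartcrop library expects exactly this layout is an assumption, and `get_pixel` is a hypothetical helper written only for this example.

```python
from typing import Tuple


def get_pixel(buf: bytes, width: int, height: int, x: int, y: int) -> Tuple[int, int, int]:
    """Read one (r, g, b) pixel from a flat row-major RGB buffer."""
    if len(buf) != width * height * 3:
        raise ValueError("buffer length does not match width * height * 3")
    if not (0 <= x < width and 0 <= y < height):
        raise IndexError("pixel coordinates out of range")
    offset = (y * width + x) * 3  # skip y full rows, then x pixels, 3 bytes each
    return buf[offset], buf[offset + 1], buf[offset + 2]


if __name__ == "__main__":
    # A 2x2 image: red, green on the top row; blue, white on the bottom row.
    image = bytes([255, 0, 0, 0, 255, 0, 0, 0, 255, 255, 255, 255])
    print(get_pixel(image, 2, 2, 1, 0))  # top-right pixel
    print(get_pixel(image, 2, 2, 0, 1))  # bottom-left pixel
```

With this layout the only things that need to cross the language boundary are one contiguous byte buffer and two integers.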
>>13772 >>13773 >>13771 OK. I have it working with PIL sending the raw pixel bytes to rust's smartcrop and then in rust I have multithreading going. Except, there's a problem. Python is way too slow, specifically PIL and I'm positive opencv will be the same. It takes a second or two for a single image just to encode the pixels and then encode the bytes to send to the rust library. Doing a pure python smartcrop won't work, it's too slow. 2-3 seconds per image with pure python. It would take almost a week to generate 100,000 images. PIL/OpenCV passing in bulk to multithreaded rust would be the same. I can get smartcropping to be fast but you've gotta send filepaths, in batches.
(forgive if this reply suddenly spams four times, I am having trouble posting) >>13775 Thank you for these updates. The bottleneck is the human part. It is fairly easy to find a hundred potential duplicates, but having the human user go through them currently takes several seconds per file. My top priority here is to start integrating customisable automatic decision-making rules into dupe processing, so we can quickly clear out easier cases like pixel-perfect png versions of jpegs and then slowly extend that as we add better and more confident tools to do 'yeah, this is definitely a resize' determinations. I don't want to add crop or rotation detection until we have the current queue more under control. I can pass you raw RGB bytes, whatever is simplest on your end. Having a multi-second conversion is a bit slow, yeah. In my experience, OpenCV is about twice as fast as PIL to load an image. It also does pixel conversion stuff all the faster, since it is jumping down to a C++ dll pretty quick, and it sometimes uses OpenCL to do GPU acceleration. I am generally confident I can load, convert, and read out raw RGB values for most images in less than a second (since it happens in hydrus all the time), but perhaps there are additional encoding concerns you need that I do not understand. I am used to scheduling CPU-heavy jobs, however, so if we end up doing this and can get it working overall in reasonable time, say less than a second per file, I can just add this as a new 'file maintenance' job type and have the client do it in the background, one every ten seconds or so, and let clients catch up in a few months of relatively very light work. In this case, I would have no concerns integrating it as an optional thing.
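The timing figures in the last two posts are easy to sanity-check with a few lines of Python. The sketch below only does the arithmetic. The 5 seconds per image is an assumption (the upper ends of the quoted "second or two" of encoding plus "2-3 seconds" of cropping), and the file counts are illustrative, not measurements.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_needed(num_files: int, seconds_per_file: float) -> float:
    """Wall-clock days to process num_files one after another."""
    return num_files * seconds_per_file / SECONDS_PER_DAY


if __name__ == "__main__":
    # Foreground run: 100,000 images at an assumed 5 s each.
    print(round(days_needed(100_000, 5), 1), "days at 5 s per image")
    # Background maintenance job: one file every ten seconds.
    print(round(days_needed(100_000, 10), 1), "days at one file per 10 s")
    print(round(days_needed(1_000_000, 10), 1), "days for a million files")
```

At one file every ten seconds, 100,000 files take about 11.6 days of uptime and a million files about 116 days, which lines up with the "few months" estimate for large clients.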
>>13791 >I can pass you raw RGB bytes, whatever is simplest on your end. PIL strictly advises against this and IIRC opencv too, that's the issue. it has to convert from the C/C++ memory into python tuples. opencv and the like are fast because they store the data in memory as C structs, do operations on it, and never have to transfer that data to python and back. Then you get another slowdown converting all these python tuples back into rust or whatever language. Example from the PIL docs: >Accessing individual pixels is fairly slow. If you are looping over all of the pixels in an image, there is likely a faster way using other parts of the Pillow API. A conversion of RGB values into a C struct requires a loop over all the tuples. There might be a way to register a "c" type extension with opencv/pil as a plugin but that sounds really complicated. Unless I'm missing something. I don't image process in python that often. >so if we end up doing this and can get it working overall in reasonable time, say less than a second per file one option I can try is to convert the PIL smartcrop library into opencv to see if there is a sizeable speedup. that *should* lead to a pure python library with less than a second per image. with rust + filepaths I can get that under a second easily and then multiply that by cores, you would be I/O read limited. Maybe I can speed it up with simd/opencl so you don't have to multithread.

