/hydrus/ - Hydrus Network

Archive for bug reports, feature requests, and other discussion for the hydrus network.


(18.87 KB 480x360 3k5KWFnnXK4.jpg)

Version 323 hydrus_dev 09/19/2018 (Wed) 22:38:26 Id: e3ce6d No. 10032
https://www.youtube.com/watch?v=3k5KWFnnXK4

windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v323/Hydrus.Network.323.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v323/Hydrus.Network.323.-.Windows.-.Installer.exe
os x
app: https://github.com/hydrusnetwork/hydrus/releases/download/v323/Hydrus.Network.323.-.OS.X.-.App.dmg
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v323/Hydrus.Network.323.-.OS.X.-.Extract.only.tar.gz
linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v323/Hydrus.Network.323.-.Linux.-.Executable.tar.gz
source
tar.gz: https://github.com/hydrusnetwork/hydrus/archive/v323.tar.gz

I had a good week. I finished a decent first version of the downloader easy-import and did some other stuff.

easy import

There is now a simple new dialog under network->downloaders->import downloaders that lets you import new downloaders just by dragging and dropping an encoded png onto it. It tries to do all the semi-complicated comparison and association work to actually build a new downloader clientside automatically. Another dialog, under network->downloader definitions->export downloaders, is for advanced users who are comfortable with the new downloader system to put these new pngs together.

These pngs should be shareable in any way any other png image would be–you can post one in a thread, or in a program like discord, or to a longer-term storage. I've written some basic help here for both sides:

https://hydrusnetwork.github.io/hydrus/help/adding_new_downloaders.html
https://hydrusnetwork.github.io/hydrus/help/downloader_sharing.html

In the coming weeks, I suspect the advanced users will start to post these easy pngs in one of these locations:

https://hydrus.booru.org/
https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/tree/master/Download%20System

But maybe we'll start an 8ch thread for them as well.
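For the curious, the general "downloader in a png" idea can be sketched in a few lines. This is a purely illustrative scheme that stashes a JSON payload in a custom PNG tEXt chunk; hydrus's actual serialisation format is its own thing, so this will not read real hydrus pngs.

```python
import json
import struct
import zlib

PNG_SIG = b"\x89PNG\r\n\x1a\n"

def _chunk(ctype: bytes, data: bytes) -> bytes:
    # length + type + data + CRC over type+data, per the PNG spec
    return (struct.pack(">I", len(data)) + ctype + data
            + struct.pack(">I", zlib.crc32(ctype + data)))

def make_png_with_payload(payload: dict) -> bytes:
    """Build a minimal 1x1 greyscale PNG carrying a JSON payload in a tEXt chunk."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte + one grey pixel
    text = b"hydrus\x00" + json.dumps(payload).encode("ascii")
    return (PNG_SIG
            + _chunk(b"IHDR", ihdr)
            + _chunk(b"tEXt", text)
            + _chunk(b"IDAT", idat)
            + _chunk(b"IEND", b""))

def read_payload(png: bytes) -> dict:
    """Walk the chunk list and pull the JSON back out of the tEXt chunk."""
    pos = len(PNG_SIG)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        ctype = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        if ctype == b"tEXt" and data.startswith(b"hydrus\x00"):
            return json.loads(data[len(b"hydrus\x00"):])
        pos += 12 + length  # 4 length + 4 type + data + 4 crc
    raise ValueError("no hydrus payload found")
```

The point is simply that the carrier is an ordinary, valid image file, which is why the pngs survive being posted to a thread or discord like any other picture.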
I am not sure what the details of the new system's overall workflow will actually be–e.g. will we need to update some parsers every few months, or will almost all of them last ten years? are there a hundred sites people want parseable, or ten thousand?–so please let me know your feedback. I could see a future iteration of this system having version control and an 'rss feed'-like way of automatically checking for and updating downloader data, but this first one is scrappy pngs on imageboards. I'll post an example importable png in the 8chan release thread so you can try the import out right now.

While I am interested in helping users learn how to create new downloaders, and I'll fold some good new defaults into the client, I do not plan to write a significant number of these new downloaders myself. Now that the system is completely user-editable, I am free to worry less about the minutiae. Please let me know how the system falls short, and I can put my time into that instead.

misc

The deadlocked network issue that users with extremely busy clients (200+ simultaneous downloads in the queue) were seeing should be fixed. I used a semi-hacky solution, basically culling max active jobs to 50 or so with a new check, so if you are one of these heavy users, please let me know how you get on. If you find a new bandwidth-based deadlock or your CPU usage spikes, I can revisit this.

The watcher and gallery downloader pages now support right-click on their lists! I've thrown in some simple stuff, like pause/play and copy url/query text, but let me know what else you'd be interested in doing here en masse. I expect I'll add a 'retry failed' next week.

Derpibooru gets a new 'no filter' search entry this week, which should search for all content (rather than just that allowed by their default filter, as a user helpfully let me know this week). The solution I am rolling out today appears to work, but if you find it gets confused and falls back to the default in some situations, let me know and I'll do a more technical solution that we have more confidence in.

Gfycat should also support file page drag and drop import! Just drag from your browser's address bar to hydrus, and it should grab an mp4.

The issues with unusual tag sorting in the media viewer/manage tags dialog should all be fixed! All tag lists are now sorted with the same code, and the incidence-based default sort will still propagate up to the media viewer correctly.
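The "cull max active jobs to 50" idea is, in spirit, just a counting semaphore in front of the network engine. A minimal sketch under that assumption (hypothetical names, not hydrus's actual code):

```python
import threading

class JobThrottle:
    """Hard cap on simultaneously active jobs, in the spirit of the
    ~50-job ceiling described above (illustrative, not hydrus's code)."""

    def __init__(self, max_active: int = 50):
        # BoundedSemaphore guarantees at most max_active holders at once
        self._slots = threading.BoundedSemaphore(max_active)

    def run(self, job, *args):
        # blocks until a slot frees up, so a queue of hundreds of
        # gallery/watcher jobs trickles into the network engine instead
        # of all landing on it at once
        with self._slots:
            return job(*args)
```

A queue of 200+ downloads then naturally waits its turn rather than saturating the engine, which is the deadlock scenario described above.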
full list

- wrote first version of the new downloader easy-import drop-panel. you drop downloader-encoded pngs on it, and it maybe asks you a question and jumbles its way through auto-importing all the required data to the client
- extended this file import to do some cleverer 'example url merging' when parsers are otherwise dupes, rather than spamming similar dupes on import
- wrote first version of the new downloader export panel. it takes gugs, url classes and parsers, and predicts sensible sub-objects to include to make functional downloaders, and bundles it into one png
- fleshed out help for the new easy import/export system
- the client now slows down gallery and watcher processing when the network engine is under heavy load, aiming for no more than 50 jobs in the system at once. the solution is a bit hacky for now, but it should alleviate the deadlock issue when there are ~180+ simultaneous gallery/watcher network jobs pending
- the multi-watcher panel's list of watchers now supports a right-click menu to copy/open urls and pause/play files/checking
- the multi-downloader panel's list of downloaders now supports a right-click menu to copy query texts and pause/play files/searching
- added a 'derpibooru tag search - no filter' GUG that disables the default derpi no-explicit-files rule
- added basic gfycat support to the default client–drag and drop any typical video page, and it should import ok
- fixed the canvas/hover window tag sorting discrepancy–all tags are now sorted with the same code, and the media view sort order should be the same as your default sort order (although in this case incidence has no effect, as there are no tag counts)
- rewrote the network job control's cog menu to be a bit more dynamic, and added 'override gallery slot requirements for this job' where appropriate
- fixed a stupid typo bug in the shutdown maintenance jobs test code that was causing pending repository work to not report right
- fixed gallery searches that include unicode characters that end up in the path of the url (rather than the query parameters)
- fixed an issue where highlighting a watcher would unpause its checking
- generalised the way the new listctrl class can produce right-click menus
- fixed some api link calculation that was over-prescribing api link display pairs (this affected the artstation file page url class by default). these pairs are now also sorted in the links dialog
- misc png-export improvements to present better with the new easy import/export stuff
- the summary texts in the tag filter panel now ellipsize (…), so if the tag filter is complicated, it won't try to boot a superwide edit panel!
- the manage subscriptions panel now correctly initially sorts in a case-insensitive way (previously, it was usually sorting A-Za-z, which is different to the regular aA-zZ resorting behaviour, so it always sort-flickered after the first edit)
- the status bar has a new segment for reporting when the client is 'busy' with different jobs. for most typical usage, it'll just stay blank. let's see how it goes.
- fixed mr. bones's wild review when the client currently has no files
- punched up the new file report mode to specify full paths where available
- improved some misc downloader code

next week

If the downloader easy-import fails in certain common ways, I'll give it a second pass. Otherwise, I'll be focusing on triaging the final downloader overhaul jobs.
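The subscription-list sorting fix in the list above comes down to sorting on a case-folded key rather than on raw characters. A quick illustration with some made-up subscription names:

```python
# the manage-subscriptions list, as it might look
subs = ["derpi", "Safebooru", "ArtStation", "gelbooru"]

# a plain sort compares raw code points, so every uppercase name sorts
# before every lowercase one: the A-Za-z order that caused the flicker
plain = sorted(subs)

# sorting on a casefolded key gives the case-insensitive aA-zZ order,
# and re-sorting it again is then stable: no flicker after an edit
friendly = sorted(subs, key=str.casefold)
```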
(6.35 KB 512x149 realbooru tag search.png)

So, fingers crossed, if you drop this on Lain in that dialog, you'll get a new search entry for realbooru. Let me know if it doesn't work!
Is there one master window where I can make all of the pieces needed for a new downloader and put them together, or do I have to do them in pieces and reference them to each other?
Just want to make a note about the derpi parser: it's only doing 15 images per page. I know that with login options I'm able to get 50 images at once. Is there a way to get 50 images from it without login options? Honestly I'm just thinking of the vast backlog I want from there when I say this, because more links per scrape means less network shit overall for the program to do.

On the fix for the clogging, there is no way for me to tell without either shit hitting the fan or making a new session and just mass dumping shit into it. I should note that while I'm adding even a few links to the network gallery tabs, shit is going demonstrably slow. I am thinking of using the new highlight options to make new windows for them so they aren't 8000 parsers big; that should free up some ram too. I've got a few options once I deal with the aftermath of the tornado and the program not working right for a week+.
(127.46 KB 1920x1040 client_2018-09-20_15-53-05.png)

Pixiv artist lookup isn't working for me. I did login to pixiv and tested it to see if it still works and it says OK, so something else must be the reason.
>>10035 My client has been in a near state of complete lockup since the update, after downloading some things from the gallery. I am unsure what is causing it; I will find out later on, once everything is settled, whether it's watchers or gallery. Are there any debug modes I should enable to give you a clear idea of what's going on?
>>10038 I am having the same issue as well.
>>10032 https://hydrusnetwork.github.io/hydrus/help/getting_started_subscriptions.html The subscriptions help page lists services->manage subscriptions as the path to get to the subs window, but it's in network->downloaders->manage subscriptions. Presumably this got moved in an update and the docs never got updated? Also, the new unfiltered derpibooru downloader seems to be working fine so far. Will report back if I run into any issues.
>>10033 I can say it worked on Gentoo. I saved the image, dropped it on Lain, and got the question on what to import (which, BTW, I'm not sure I could investigate properly at that point for anything malicious or for cluttering up other rules; it's gonna be "ok"), and it works.
>>10040 A little bit of experimenting: 4 gallery pages, 4 watchers. Downloading the galleries was not horrible, but adding the 4 watchers was bad.

Moving from the watcher page (the highlighted watcher was not one downloading) to another page took a bit of time, but once there everything became quite a bit more responsive, though that could also be because some of the watchers had finished by that point.

Going over to the gallery page, again with a gallery highlighted that isn't downloading, the problem came back. While it's still more responsive than with the watchers, it's still laggy. Moving over to another tab not watcher/gallery related at all, and the program feels like nothing is going on, despite 4 watchers and 500 images between them so far (it should hit a little over 1000 by the end).

The conclusion I can draw: something is going on when you are on a gallery/watcher page, even if the watcher/gallery that is downloading isn't highlighted, that gums the whole system up. This could be a consequence of having so many watcher/gallery pages; it will be a while before I can rule them out as a reason.
Ok, going through my list of watchers to harvest shit from when the tornado took power out long enough to cause the network fullstop, I came across something I think needs to be addressed. One of my watchers is from 2 months and 24 days ago. I'm thinking 'no fucking way', so I open it, and sure enough it's still going, though it hit the comment bump limit. But I notice it's a file short, so I hit check now. It takes so long to check that I think something fucked up, so I load up the thread, and as I get to the bottom, the threadwatcher goes from 249 to 249/251.

So, the problem is there is no feedback that something has been manually checked, and no feedback that it's in the process of trying. First thing: in the status window, a 'checking' note should appear when the page is up for a check, either manual or automatic, just for some kind of feedback that it's happening.

I also think a 'last check' time would be a great thing to have. As you can imagine, sorting by date when I have a 2m24d thread puts said thread way at the fuckin' bottom, but a last check time would allow someone lazy like me to sort through threads in order of when they died. It may not be the most useful metric, but I can see uses for it. That said, if this metric did get added, I would make it optional, as many people may never see a use for it.
Is there any possibility of making thread watchers compatible with 2chan? It's probably not an immediate priority, since I imagine the Hydrus userbase is primarily English speakers, but I'd find it useful. Actually, is there a way for me to add it myself? And, if so, how would I go about doing it? If it's something that very few people are going to use, it would probably make more sense for me to just do that.
>>10051 The 'pages -> new download page -> simple downloader' does the trick for me for downloading everything at least.
>>10034 The latter. There are three main objects–GUGs, URL Classes, and Parsers–and three dialogs under downloader definitions to manage them, and then manage url class links to explicitly join url classes to parsers. Please start here if you'd like to learn about creating in the new download system: https://hydrusnetwork.github.io/hydrus/help/downloader_intro.html
>>10035 >>10040 >>10045 I hope the login manager will solve the 15->50 sorts of problems. Several boorus and sites have username-based custom filters and blacklists and all that, so that's the direction I will try first (rather than trying to write some dynamic data/ui to interface with each different website's options clientside). The downloader usually doesn't care how many results it parses, especially for sites like derpi that index gallery URLs by page number. Re: your busy client, is the new status bar indicator at the bottom-rightish always saying 'very busy'? If you hit help->debug->data actions->review threads, are there several hundred in there doing different jobs?
>>10038 >>10042 Thanks. I am working with some other users on this right now. It looks like they moved their artist galleries over to the dynamic phone-friendly layout. We know the basic solution, so if it works out, I hope to roll out updated parsers next week. If someone else pre-empts me, it may appear earlier, likely at the github linked in my release post.
>>10043 Thanks. Yeah, a lot of my downloader-related help is now out of date due to the overhaul. I hope to catch it up now that I am winding it all down (and the ui is more fixed etc…).
>>10051 It is completely possible to write your own 'watcher' parser, although this is a semi-difficult parser to write. For now, I recommend you use the simple downloader as >>10054 suggests to fetch linked images or whatever, then maybe have a look at the downloader help here and write your own: https://hydrusnetwork.github.io/hydrus/help/downloader_intro.html I don't know if 2chan has an easy json api like 4chan/420chan/8chan do, but you might be able to adapt one of those existing ones. I am happy to help you learn the system, and the discord (and that github) also has a bunch of users who are getting into the meat of the system. If you do make one and get it to work, please post it here or at the github as one of the new pngs. I am sure there are other users who would like 2chan support, either now or in the future, or who could use your work to figure out their own html thread parser for some other site.
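To give a feel for what adapting a json thread api involves: 4chan's read-only API serves a thread as JSON, and a parser mostly just maps post fields to file URLs. The function below sketches that shape against a sample blob; whether 2chan exposes anything similar is unknown, and the field names here are 4chan's, not a hydrus parser definition.

```python
def parse_thread_files(thread: dict, board: str) -> list:
    """Collect direct file URLs from a 4chan-style thread JSON blob.

    4chan serves threads at https://a.4cdn.org/{board}/thread/{no}.json ;
    posts that carry a file have 'tim' (server-side filename timestamp)
    and 'ext' fields, and the file lives on the i.4cdn.org media host.
    This only shows the general shape such a parser takes.
    """
    urls = []
    for post in thread.get("posts", []):
        if "tim" in post and "ext" in post:
            urls.append(f"https://i.4cdn.org/{board}/{post['tim']}{post['ext']}")
    return urls
```

In the hydrus downloader system, the same mapping would be expressed as content parsers inside a Parser object rather than as Python.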
>>10042 >>10038 >>10057 Ok, looking at this again, the japs have fucked it up more than I thought. We have a way to get the data we want, but it is out of order. I am working on this problem.
>>10032 Is there a way to force the client to show a certain selection of results while using system limit? Specifically, let's say I have 1000 files with a tag and a system limit of 500. When I search that tag, it will display 500 (random?) results for that tag. Is there then a way for me to force the client to show the 500 it did not already show me? My example is a little simple. Realistically, I have 3000-6000 results for tags and loading those all up at once makes for a pretty heavy session. Thus I'd prefer to do it in "small" bites if at all possible.
>>10056 I'll get something tomorrow. However, I also found a potential issue with subs where either I'm stupid and the fix is already there, or an option should be added for it. I set up a derpi no-filter sub with: first_seen_at.gt:3 days ago, upvotes.gte:150. That filters anything old out, and anything under 150 upvotes. For the most part, if an image hits this list I would usually download it anyway, so get them all and sort them later.

However, herein lies a problem: it appears that the sub won't re-search the pages it already had, so it was only getting the few new files present on the last page each time. But a gallery import set to the same query, and set to only show new files, managed to show 12 files that were new. It's possible all these files were boosted up to the 150 mark in the gap between the finish of the sub check and the beginning of the gallery search (somewhere between 15 seconds and a few minutes, I'm not sure). It will probably take me a few days to really see what's happening, or whether I'm completely wrong; it could go either way.
(6.40 KB 534x55 2018-09-23_19-03.png)

So I still have the problem that I described a few threads ago: some GIFs simply don't display correctly. At that point you had mentioned something about having enabled PIL to load gifs in the media options. I don't, nor did I ever, have that enabled, yet some GIFs still display corrupted for me, as I've described here: https://8ch.net/hydrus/res/9608.html Could you take a look at this again? Is OpenCV maybe not bundled/detected properly? I do have it installed on my system (Arch Linux).
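As a neutral aside, unrelated to hydrus's own decoding path: gifs that render with corrupt frames are sometimes just structurally truncated files, which a stdlib walk over the GIF block structure can detect before any decoder is involved. A hypothetical checker:

```python
def count_gif_frames(data: bytes) -> int:
    """Walk a GIF's block structure and count image frames.

    A structural sanity check: a truncated file raises ValueError
    before the trailer byte is reached, whereas an intact file
    returns its frame count.
    """
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF")
    packed = data[10]          # logical screen descriptor's packed byte
    pos = 13
    if packed & 0x80:          # global color table present
        pos += 3 * (2 << (packed & 0x07))
    frames = 0
    while pos < len(data):
        block = data[pos]; pos += 1
        if block == 0x3B:      # trailer: clean end of file
            return frames
        if block == 0x21:      # extension: label byte, then sub-blocks
            pos += 1
            pos = _skip_subblocks(data, pos)
        elif block == 0x2C:    # image descriptor
            frames += 1
            packed = data[pos + 8]
            pos += 9
            if packed & 0x80:  # local color table
                pos += 3 * (2 << (packed & 0x07))
            pos += 1           # LZW minimum code size
            pos = _skip_subblocks(data, pos)
        else:
            raise ValueError(f"bad block 0x{block:02x} at {pos - 1}")
    raise ValueError("truncated: no trailer byte")

def _skip_subblocks(data: bytes, pos: int) -> int:
    # sub-blocks: a length byte then that many bytes; 0x00 terminates
    while data[pos] != 0:
        pos += data[pos] + 1
    return pos + 1
```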
(50.32 KB 457x472 YZ3N43y.png)

>>10066 Attached is a screenshot of my help->about dialog.
>>10066 >>10067 Also, is there some kind of way to add threads to specified watchers programmatically? It doesn't have to be complicated. Something I could imagine is just calling hydrus-client (when an instance is already running) with certain arguments. For example: `hydrus-client add-thread-watcher <id> url`. It then checks if a Hydrus instance is already running, and if so, this instruction is forwarded to the running instance. Thread watcher pages would have to expose UIDs for this to work. Just something to think about.
(92.52 KB 1816x1165 dwm_2018-09-23_13-11-25.png)

(115.21 KB 1816x1165 dwm_2018-09-23_13-13-28.png)

(217.87 KB 2338x2024 client_2018-09-23_13-14-18.png)

Here is a before and after. It's a bit hard to capture because the client is locking up for seconds to minutes at a time. There's not exactly a whole lot of change, so maybe there's something more there that I don't see.
(217.87 KB 2338x2024 client_2018-09-23_13-14-18.png)

(351.09 KB 2318x2024 client_2018-09-23_16-14-25.png)

(348.33 KB 2318x2024 client_2018-09-23_16-14-37.png)

(370.78 KB 2318x2024 client_2018-09-23_16-14-58.png)


>>10069 Ok, here are some 59 threadwatchers dumped in. While the client is laggy and hangy, if I am still on the threadwatcher tab, it's fuckin' unusable. However, moving over to a different tab, while it still has, I think, the same load, isn't causing the client to be unusable, just not fun to use, if this distinction makes much sense.
>>10071 That said, I have to ask: if you are ever able to make the client multithreaded, can you add an option to define how many threads get used, or possibly which aspects of the program get dedicated threads? If the client went multithreaded and decided to ping my CPU to 100%, it would basically be prime95 the whole time, and that's just not something I could deal with.
>>10063 More on this: I have had the sub going for 1 day 12 hours now. After the initial gallery download, the watcher has picked up 64 images; however, a gallery search with the same parameters shortly after one of the checks came back with 64 new images.
>>10067 >>10066 Thanks for this update. My guess is opencv is sperging out due to some .so not loading right. I will write an 'image load report mode' in the next week or two that will specify which library is being used as images and gifs load, and when failures occur. Are you running from source or a built package? Anyway, please run this report mode when it comes in and let me know what you see.
>>10068 Yeah, I'd like to write a client api at some point that will let people write scripts to tell the client to do simple stuff like this. It'll just be an http server you can talk to, with some simple authentication. This will obviously be a big job. It'll be up for vote in the next big poll.
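As a rough illustration of what such a client api could look like, here is a tiny stdlib HTTP endpoint that accepts an "add watcher" instruction with key-based auth. Every endpoint, header, and key name here is invented; no such API exists in the client at this point.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

API_KEY = "example-key"  # hypothetical; the real auth scheme is undecided

class ClientAPIHandler(BaseHTTPRequestHandler):
    """POST /add_watcher with {"url": ...} asks the 'client' to watch a
    thread. Endpoint and header names are invented for illustration."""

    def do_POST(self):
        if self.headers.get("X-Api-Key") != API_KEY:
            self.send_error(403)
            return
        if self.path != "/add_watcher":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length))
        # a real client would hand body["url"] to a watcher page here
        payload = json.dumps({"watching": body["url"]}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):  # keep the demo quiet
        pass

def demo(url="https://example.com/res/123.html"):
    """Start the server on an ephemeral port, make one call, shut down."""
    server = HTTPServer(("127.0.0.1", 0), ClientAPIHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    req = urllib.request.Request(
        f"http://127.0.0.1:{server.server_port}/add_watcher",
        data=json.dumps({"url": url}).encode(),
        headers={"X-Api-Key": API_KEY},
    )
    with urllib.request.urlopen(req) as resp:
        result = json.loads(resp.read())
    server.shutdown()
    return result
```

A `hydrus-client add-thread-watcher` style wrapper, as suggested above, would then just be a script that makes this one HTTP call against the running instance.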
>>10071 >>10069 >>10072 >>10076 Thanks. Although your threads are spiking there, it doesn't look horrific or out of place. My guess is this is ui-bound, particularly since looking at the busy watcher page is making things noticeably more laggy. I will review this code and see if I can optimise and slow the refresh rate based on the number of watchers being tracked. Although the client uses different threads to do its stuff, python unfortunately cannot use multiple processor cores simultaneously, so it never really is 'truly' multi-threaded. It can use multiple when it dips down to the C++ level (typically when it is doing low-level stuff like heavy math/file processing), but otherwise, it'll typically max out at 12.5% or 25% CPU. My threading code sucks as well (it does a lot of async shit synchronously, just waiting on results), which is a big cause of a lot of ui hang.
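The GIL point above can be seen with a tiny experiment: pure-Python work spread over several threads still executes one bytecode stream at a time, so the threads buy organisation, not parallel CPU. A minimal sketch:

```python
import threading

def busy_sum(n: int) -> int:
    # pure-Python arithmetic holds the GIL the whole time it runs
    return sum(i * i for i in range(n))

def run_in_threads(n: int, workers: int) -> list:
    """Run busy_sum in several threads. These are real OS threads, but
    CPython's GIL lets only one execute Python bytecode at a time, so
    wall-clock time is roughly the same as doing the work serially
    (hence the ~12.5%/25% CPU ceiling mentioned above)."""
    results = [None] * workers

    def work(slot: int) -> None:
        results[slot] = busy_sum(n)

    threads = [threading.Thread(target=work, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Only when the work drops into C (heavy math, file decoding, compression) can the GIL be released and a second core actually help.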
>>10087 One of the things you responded to was a different issue, with a subscription. On the side of reducing load: if possible, can you keep the top-level counts? I like seeing XXX/XXXX to see how much work is left. Sometime this week I'm going to dump the entirety of my thread backlog into a saved session. I have a feeling the watchers are causing quite a bit of ram use even when they aren't visible; it will at least give me a point of comparison once I cut the somewhere around 8000-10000 thread watchers down to around 800 active.
>>10085 I'm running from source, specifically the AUR package on Arch Linux. Thanks for looking into it, and I'll run it in report mode when the time comes.

