/hydrus/ - Hydrus Network

Archive for bug reports, feature requests, and other discussion for the hydrus network.


(9.25 KB 480x360 ZQSpPrZSDQ8.jpg)

Version 318 hydrus_dev 08/15/2018 (Wed) 21:39:12 Id: ea9080 No. 9695
https://www.youtube.com/watch?v=ZQSpPrZSDQ8

windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v318/Hydrus.Network.318.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v318/Hydrus.Network.318.-.Windows.-.Installer.exe

os x
app: https://github.com/hydrusnetwork/hydrus/releases/download/v318/Hydrus.Network.318.-.OS.X.-.App.dmg
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v318/Hydrus.Network.318.-.OS.X.-.Extract.only.tar.gz

linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v318/Hydrus.Network.318.-.Linux.-.Executable.tar.gz

source
tar.gz: https://github.com/hydrusnetwork/hydrus/archive/v318.tar.gz

I had a great week. I caught up on more small stuff and put some more work into the new gallery downloaders.

downloaders

Unfortunately, tumblr decided to kill their 'raw' url access this week. There is not a lot of firm word from tumblr on the issue, but there is some scattered conversation around, and it seems it is dead and not to return. Whatever the ultimate reason for this, it broke our tumblr downloader, so v318 has an updated tumblr parser to fetch the 1280px versions of urls. I have also taken the opportunity to switch the tumblr 'gallery' parser over to the new system, so the tumblr downloader now fetches and associates neater 'post urls' for its file import objects rather than raw file urls and adds 'creator' tags.

However, because of the URL changes, your tumblr subscriptions will hence hit their 'periodic' file limits and likely redownload some 1280px versions of files you already have in raw–if this affects you heavily, you might want to pause your tumblr subs before you update and carefully experiment and curate what happens after you are working again in v318.
Note that some artists (like lmsketch) attach a lot of images to their posts, so if your periodic limit were 100–and that 100 now means 'posts' instead of file urls–you are potentially talking a lot of files that are going to be redownloaded! Again, I recommend heavy tumblr subscribers pause before update and maybe consider recreating their tumblr subs from scratch with an initial file limit of 10 or so.

The multi-gallery download page has some more improvements this week. I've fixed an issue where the sub-downloaders on the same page were unintentionally all sharing some bandwidth rules with each other. Multi-galleries also now have an 'added' column. And the 'pause' ⏸ and 'stop' ⏹ characters used in the lists on the multi- pages are now editable, also under options->downloading, for those who have trouble viewing this unicode.

I have also made the 'only get a gallery page every x seconds' option global to the whole program (it was previously per-downloader). Being able to create twenty new whateverbooru queries at once with a single click of the paste button is powerful and great, but it also meant spamming servers with many heavy gallery requests all at once, so now all downloaders share the same slot that comes up every x seconds. The delay option is under options->downloading. I recommend 15s for downloaders and 5s for subscriptions. Let's see how 'global' works, and if it is an annoying bottleneck, I will see about making it per-domain.

Subscriptions now auto-compact whenever they sync. This means they delete old fully processed URLs they no longer need to calculate file velocity just to keep them running snappy. You shouldn't notice any change except maybe a faster-loading 'manage subscriptions' dialog.

A couple of unusual data problems meant that xbooru and gelbooru were not searching well in the new system. I have fixed these, so if you got affected by this, please rerun your queries and let me know if you still have any problems.
I also added gallery parsers for rule34.paheal and mishimmie (the paheal update should finally fix the 'paheal gets gallery urls in file results' issue!).

Advanced users might like to refer to the gelbooru situation (and tumblr and artstation api gallery url classes) to look at url classes' new 'next gallery page' component, which lets you define an easy logical way to predict the next gallery page from a recognised gallery url and now acts as a fallback if the gallery parser cannot find one (as is usually the case with api results!).

full list

- downloaders:
- extended url classes to support 'next gallery page' generation–a fallback that predicts next gallery page url if the parser cannot provide it (as is often the case with APIs and unreliable next-page-url galleries such as gelbooru)
- integrated this new next page generation into new gallery processing pipeline
- updated gelbooru, tumblr api and artstation gallery api url classes to support the new next gallery page business
- fixed the url class for xbooru, which wasn't recognising gallery urls correctly
- wrote new gallery parsers for rule34.paheal and mishimmie (which are both shimmie but have slightly different gallery layout). this should finally solve the 'one paheal gallery url is being parsed into the file list per page' problem
- 'fixed' the tumblr parser to fetch the 1280px url (tumblr killed the raw url trick this past week)
- misc text/status fixes
- wrote a gallery parser for tumblr that fetches the actual tumblr post urls and hence uses the new tumblr post parser naturally! (tumblr post urls are now more neatly associated as 'known urls' on files!)
- note that as the tumblr downloader now produces different kinds of urls, your tumblr subs will hit your periodic limits the next time they run.
they will also re-download any 1280px files that are different to the previously fetched raws due to the above raw change (protip: keep your subscription periodic file limits low)
- cut the 'periodic limit' subscription warning popup down to a much simpler statement and moved the accompanying help to a new help button on the edit sub panel
- multi-gallery pages now have an 'added' column like multi-watchers
- the new 'pause' ⏸ and 'stop' ⏹ characters shown in the multi-downloader pages are now customisable under options->downloading (some users had trouble with the unicode)
- the watcher now shows the 'stop character' if checking is 404/DEAD
- fixed an issue where the new gallery imports on the same multi-page were all sharing the same identifier for their ephemeral 'downloader instance' bandwidth tracker, which meant they were all sharing the same '100rqs per 5mins' etc… rules
- the page and subscription downloader 'gallery page delay' is now program-wide (since both these things can run in mass parallel). let's see how it goes, maybe we'll move it to per-site
- subscription queries now auto-compact on sync! this means that surplus old urls will be removed from their caches, keeping the whole object lean and quick to load/save
- gallery logs now also compact! they will remove anything older than twice the current death velocity, but always keep the newest 25 regardless of age
- .
- misc:
- the top-right hover window will now always appear–previously, it would only pop up if the client had some ratings services, but this window now handles urls
- harmonised 'known urls' view/copy menu to a single code location and added sorted url class labels to entries (which should reduce direct-file-url misclicks)
- greatly sped up the manage tags dialog's initial calculation of possible actions on a tag alteration event, particularly when the dialog holds 10k+ tags
- greatly sped up the second half of this process, when the action choice is applied to the manage tags dialog's current media list
- the buttons on the manage tags dialog action popup dialog will now only show a max of 25 rows on their tooltips
- some larger->smaller selection events on large pages with many tags should be significantly faster
- subscription popups should now 'blank' their network job controls when not working (rather than leaving them on the old job, and without flickery-ly removing the job control completely)
- the file cache and gallery log summary controls now have … ellipsized texts to reduce their max width
- fixed an issue where larger 'overriding bandwidth' status wait times would sometimes show instead of legit regular smaller bandwidth wait times
- removed a now-superfluous layer of buffering in the thumbnail grid drawing pipeline–it seems to have removed some slight lag/flicker
- I may have fixed the issue where a handful of thumbs will sometimes remain undrawn after several fast scrolling events
- gave the some-linux-flavours infinitely-expanding popup message problem another pass. there _should_ be an explicit reasonable max width on the thing now
- added a 'html5lib not found!'
notification to the network->downloaders menu if this library is missing (mostly for users running from source)
- help->about now states if lz4 is present
- gave 'running from source' help page another pass, including info on running a virtual environment
- in file lookup scripts, the full file content now supports string transformations–if this is set to occur, the file will be sent as an additional POST parameter and the content-type set to 'application/x-www-form-urlencoded'. this is a temp fix to see if we can get whatanime.ga working, and may see some more work
- if the free space on the db dir partition is < 500MB, the program will not boot
- if the free space on the db dir partition is < 1GB, the client will not sync repositories
- on boot the client can now attempt to auto-heal a missing local_hashes table. it will give an appropriate error message
- misc post-importing-cleanup refactoring

next week

I still have a few more small things I want to catch up on, but it isn't so urgent now, so I'd like to get started on the new 'searcher' object, which will be the final component of the downloader overhaul (it will convert the initial 'samus_aran' search phrase into an initialised search url). I feel good about it and may have some test ui for advanced users to play with by v319.
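To make the 'next gallery page' fallback from the list above a bit more concrete, here is a minimal Python sketch. It is not the client's actual url class code: the function names, the example.com url, the `pid` parameter name, and the step of 42 are all illustrative assumptions. It only shows the general idea of bumping a page index in a recognised gallery url when the parser does not supply a next-page link.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def predict_next_gallery_url(url: str, page_param: str, step: int = 1) -> str:
    """Predict the next gallery page by bumping one numeric query parameter.

    page_param and step are hypothetical settings; a real url class would
    define its own rule for each site.
    """
    parts = urlsplit(url)
    query = parse_qs(parts.query, keep_blank_values=True)
    # A missing parameter is treated as the first page (index 0).
    current = int(query.get(page_param, ["0"])[0])
    query[page_param] = [str(current + step)]
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))


def choose_next_gallery_url(parsed_next: Optional[str], current_url: str,
                            page_param: str, step: int = 1) -> str:
    """Prefer the url the gallery parser found; otherwise fall back to prediction."""
    if parsed_next:
        return parsed_next
    return predict_next_gallery_url(current_url, page_param, step)
```

With these assumptions, a gallery url ending in `pid=42` and a step of 42 would predict `pid=84` whenever the parser returns nothing, which is the situation described for api results.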
>>9695 Thanks! Great work as always. Question about Tumblr, if I do recreate my subs with a limit of 10, will that just check for the first 10 duplicates (assuming a given sub has the same image for the 1280px as for the raw, due to small raw images) and then go to the next one, or will it check "freely" and only stop after actually downloading 10 posts worth of files it considers "new" and absent in the db? The real QoL winner here, for me, is the global request bottleneck. I might trim the times down a bit from your recommended after some testing, but this is useful not just for sparing large and powerful servers but to keep my cheap wifi card from melting onto the motherboard or micro-freezing hydrus if I have a lot of active downloaders going at once (which I always do).
Ok, playing with galleries a bit more, I noticed that the galleries were checking every 15 seconds, something I don't notice with 4chan. Instead of 1 slot globally, would it be possible to make it 1 slot per domain? Or, if that's not possible, to chain the requests so that it doesn't take over a minute before an entry tells me that nothing is there… Perhaps, because I want faster confirmation that I inputted what I wanted right, would it be possible to allow unlimited initial gallery slots and then apply the current limit to subsequent ones?
>>9695 Thanks, my dude. I'm the retard that was constantly complaining about gelbooru not fetching tags. This release made it magically work again and I didn't have to do anything. Seriously, I don't know what caused it but I could not get tags from gelbooru for months. It was horrible to use incomplete danbooru and slow as fuck san to get anything. I think it is inevitable that more and more things will get pay walled or account walled in the future so I hope that the login manager you have planned will be able to further restrict downloading to not accidentally trigger bans. May god bless your soul.
dicking around in hydrus a bit, and then acdsee…

Holy shit I fucking hate acdsee. The shit is bloated to hell and back, the older versions that were not bloated can't do unicode, and now half the manga I download I have to decompress and recompress. I'm so fucking sick of it, but I am unable to move off it because it's a complete package.

If you ever do a manga version of hydrus, all you need is:
1) a user-searchable archive method. There may be a time when hydrus goes tits up and user-readable files would be necessary
2) potentially keeping series together in one place
3) archive view, as volumes are easier to deal with in archives than as files, and the thumbnails from the image hydrus already eat 5 gigs more than they should, I would hate to see what happens with manga
4) user-defined thumbnail for series, user-defined thumbnail for archives (if they even choose to thumb that), and self-generating thumbnails on open rather than thumbnails for every image in the archive, if only for the sake of space and simplicity. potentially a non-thumbnail view mode.

even as bad as hydrus ran when I had 600k thumbnails open, it was worlds better than acdsee ultimate 10 runs normally, but I have to use it for unicode. thank fucking christ acdsee 8 (a 10 year old version) runs fast and most manga don't use unicode, otherwise life would be never-ending hell. but then again 8 goes into a failure cycle every now and then where it constantly crashes for half an hour straight before it's good to go.

it's shocking to me how close hydrus is to being the best manga reader. Though I highly suggest that when you make hydrus work for comics (if I remember right you have expressed interest in it), you do it with a spinoff version.
>>9695 Tumblr pretty well fucked up everything, didn't it? I'm getting a lot of duplicate redownloads, as expected, and they tend to keep going for awhile. So I've paused my Tumblr subscriptions until, I dunno, things get unfucked on their end or I just do a big re-scrape again ( ;; -д-)
>>9695 >>9707 tags, btw are not downloading and I assume that's Tumblr's fault.
Ok, I tried the parser you posted last time and everything broke. With moving to 318 everything worked; no idea if that was because of the upgrade or if I needed to restart to make them work anyway. I added the webm parser for derpi and restarted, and it didn't work. Until I get further instructions, I'm going to leave it where it is.
Ran into an 'I'm pushing the program a bit hard' problem. So I go to 4chan, I open all the threads on a page I want images from, copy them, and paste them into hydrus. This works just fine; however, the program tends to hang for 15-30 seconds and only becomes really usable again after all files are downloaded. Now that in and of itself isn't an issue. The problem is that the tooltip that pops up when you hold the mouse over a button for a while decides to stay on top of every single thing you want to do, regardless of what you do. Is there an option to turn pop-up messages like that off? If there isn't and you add one, can you add it to the help drop-down rather than the options menu? Reason being, at some point I think I will want popups when I have no clue what I'm doing, but I'll forget where they are if it's just an options menu checkbox rather than a help menu toggle.
DAMN TUMBLR THEY PLAYED US LIKE A DAMN FIDDLE
>>9709 we still have no downloader for derpibooru right? man i got so much shit from there i just want to get it sorted out and move on to sort other things.
hydev, try to download these two pictures through the URL downloader https://e621.net/post/show/243115 https://e621.net/post/show/243249 that problem where the program recognizes different URLs as the same picture is happening again. these two are a good example of it.
>>9698 The tumblr downloader now produces entirely new URLs for its import objects, so it will consider anything you get as 'new' for its subscription-calculations. You can check it out with just a little test search on a gallery page–tumblr now gets 'post' urls rather than direct file urls as you'll see in your old tumblr subs. Furthermore, the actual file URLs that will be tested during the actual download-and-import phase are new, with some exceptions (we are back to the old 1280 format, so you may or may not have some of those still mapped at the db level from before we switched to raw). When we were raw, all the URLs were in raw format, so everything will appear new. >>9707 >>9708 Yeah, it is all fucked. I don't think it is coming back–a bunch of people sent requests to their customer service thing, and they got corporate bullshit back. The overall thing feels to me like a downsizing effort on their part, but I don't know for real. I wouldn't be surprised if they are deleting the raws right now to save storage space. They can't be making much revenue, and I can't imagine that Verizon has much love for preserving HD HomestuckXSherlock content. You don't want to be overwhelmed with dupes (at least until I have some better auto-de-duper in place), so I recommend cutting your tumblr subs' 'regular' file limits to like 10 just one time (with a 'check now' if you like to force it to go), so your tumblr subs will do one 'small' sync to catch up to now and the new URLs and then once all that goes through, you can bump the file limit back up and it'll resume, working on the new URLs. >>9721 The raws I've lost… the banter I've lost… won't stop hurting… It's like they're all still there. You feel it, too, don't you? I'm gonna make them give back our content.
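The reason everything looks 'new' after the url change can be sketched roughly as below. This is a simplified assumption about how a subscription sync might decide when to stop (walk the gallery newest-first, stop on an already-seen url or at the periodic limit), not the client's real logic. The function name and the example.com urls are hypothetical.

```python
from typing import Iterable, List, Set


def collect_new_urls(gallery_urls: Iterable[str], seen_urls: Set[str],
                     periodic_limit: int) -> List[str]:
    """Return the urls a sync would queue, walking the gallery newest-first."""
    new_urls: List[str] = []
    for url in gallery_urls:
        if url in seen_urls:
            # Assumed rule: reaching a url we already know means we have caught up.
            break
        new_urls.append(url)
        if len(new_urls) >= periodic_limit:
            # Periodic file limit hit; stop even though we never caught up.
            break
    return new_urls
```

Under this assumption, a sub whose cache only holds old raw file urls never sees a match among the new post urls, so it keeps going until it hits the periodic limit. That is why dropping the limit to something like 10 for one sync keeps the catch-up small.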
>>9700 Yeah, I think I'll make it per domain. It hits parallel subs pretty bad as well IRL, even when they are only competing every 5s. It's hacky anyway atm, so I'll give it another pass. Rolling thread checking into it is a good idea. It probably isn't too heavy for 4chan/8chan to deliver the JSON, but we prob don't want to hit them with like 100 rqs all at once as soon as a client wakes from sleep or something. >>9708 Oh, you can get tumblr tags if you go network->downloader definitions->url class links and then match the 'tumblr file page api' with the alternate 'tumblr api post page parser - with post tags'.
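For anyone wondering what a per-domain gallery slot might look like compared to the single global one, here is a small Python sketch. The class name, method name and injectable clock are assumptions made for illustration. It is not the client's implementation, and a real multi-threaded downloader would also need a lock around the bookkeeping.

```python
import time
from typing import Callable, Dict
from urllib.parse import urlsplit


class GallerySlots:
    """Hands out one gallery-page slot per domain every `delay` seconds."""

    def __init__(self, delay: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._delay = delay
        self._clock = clock
        # domain -> earliest time the next gallery request may start
        self._next_free: Dict[str, float] = {}

    def try_acquire(self, url: str) -> bool:
        """Return True and reserve the slot if this url's domain is free now."""
        domain = urlsplit(url).netloc.lower()
        now = self._clock()
        if now < self._next_free.get(domain, float("-inf")):
            return False
        self._next_free[domain] = now + self._delay
        return True
```

Keying the table on a single constant instead of the domain gives the current program-wide behaviour, so the two designs differ only in what counts as 'the same slot'.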
>>9705 I like ACDSee for the instant image viewer and have it as my default app to launch jpg/gif/png. I think I have 12, which has the similar problem of superbloat as soon as you try to load the actual app. I sort of have a plan to one day spin out the hydrus media canvas code into its own exe for quick media rendering of regular files in regular folders, and then I'd drop ACDSee completely. I like your ideas for manga. Approaching that will be a big job, but I think worthwhile, so we'll see how the poll goes after downloader.
>>9709 It might be because of html5lib being missing, not sure if I mentioned that before. You can check under help->about to see if it loaded ok. That was a common cause for gelb tags disappearing (I think gelb has funky markup that lxml can't parse). >>9722 You should be able to drag-and-drop import Post URLs, and if you can arrange a bunch of Post URLs into a big newline-separated list in your clipboard, you should be able to paste them en masse into the URL downloader. But searching will have to wait for the 'searcher' object, which will be a few weeks. I will add full support for it naturally in an update.
>>9709 Sorry, I read that wrong. For derpi, can you check if you have html5lib, and if you do, can you go into network->downloader definitions->manage parsers and load up the parser and do a 'fetch data' and 'test parse'. Do you get some tags and urls back after a test parse, or nothing/only some? >>9712 Is this the paste button beside the text entry for a thread URL? What's the tooltip popup usually say? I think this might be the same issue as I say in >>9728 . I will make thread checks run on a slot as well, and we'll see if that reduces the CPU kill on mass-paste.
>>9726 Thank you for this report. I had no problem, so I think this might be a legacy issue where when we had this problem, you got some bad URLs associated and they are still hanging around. 243115: hash: f97dd2fcefa0ea7ead42d469718f50b1dd59b3d7c80050920c5ba1529e1f1f4c urls: http://www.furaffinity.net/view/8758805/ 243249: hash: 97e16d7197b93d9f9235bc18799ed08987be5e5f486bcd6bef85fbca4c769e5e urls: https://www.pixiv.net/member_illust.php?illust_id=43778628&mode=medium http://www.furaffinity.net/view/8758805/ If you open a new file page for 'all known files'/'public tag repo' and try searching system:hash=(one of those hashes), you should be able to access the one which isn't in your client and manage the known urls and see if there is an accidental conflict or something here. EDIT: The alternative possible problem is if you have a furaffinity url class from somewhere that doesn't expect FA post urls to ever have multiple files. They both have the same FA url attached, so if your client believes that FA urls can only ever apply to one file (as the user-made one here says, I see: https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/blob/master/NEW%20Download%20System/URL%20Classes/furaffinity.net%20post%20url.txt ), then the client may be inferring something inaccurate. Do you have an FA account? Can FA links like the 8758805 one have multiple files, or is e621 in the wrong here with their URL attribution?
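The suspected conflict can be illustrated with a short Python sketch. It assumes a simple table from known url to file hashes and a check that treats any matching url as 'already in db'. The function name and the `trust_source_urls` flag are hypothetical, and the real url-class logic is more involved than this.

```python
from typing import Dict, Iterable, Set


def looks_already_in_db(post_url: str, source_urls: Iterable[str],
                        known_urls: Dict[str, Set[str]],
                        trust_source_urls: bool = True) -> bool:
    """Guess from urls alone whether a file is already known.

    known_urls maps a url to the hashes of files it has been attached to.
    """
    urls_to_check = [post_url]
    if trust_source_urls:
        # Source urls come from the site's metadata and may point at a
        # different file, e.g. an edit that links back to the original.
        urls_to_check.extend(source_urls)
    return any(known_urls.get(url) for url in urls_to_check)
```

If the first e621 post was imported with the FA url attached, the second post (which lists the same FA url as a source) would be reported as already in the db when source urls are trusted, and as new when they are not.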
>>9729 honestly, the main must for manga is user-readable file names/structure, in contrast to current hydrus, where I think each image has its hash as a name. Another aspect that would be hard is how you sort the manga or even present it. Personally, with acdsee I use detailed mode and one massive fuck-off download folder. I used to sort manga by series, but that got so fucking tedious I stopped… I should point out that when I was reading manga heavily I would be downloading nearly everything that got released on any given day.

Honestly I think there would need to be 3 different database points:
1) series
2) oneshot
3) porn

we have to be honest, many people would be using this to sort their hentai (personally I would love it for that, but honestly just a general manga reader would be awesome due to how shit acdsee is). I say sort it this way because I personally have probably more oneshots in my collection than I have manga series, or it's possibly getting up there, and if it ever came down to moving away from this as a manga reader, it would be so much easier if the series and oneshots were separated. porn being porn, it's rarely ever a multi-book work, so making a separation like that would be worthless.

honestly, I think a db browser may be called for, one where you could choose which you wanted to go into, and once in, you would be presented a list of what's in it (in a user-readable way).

I also think making it acdsee-compatible for button layout would be good, as I don't think I have come across a program with a better default layout for keys. before I had a 4k tv as a monitor, I used the arrow keys for scrolling, page up & down for next page/previous page, backspace to go up folders, and enter to go in and fullscreen.
I think having the program searchable and giving you a 'virtual' folder for your search would also be good. think how hydrus works now for images, except instead of images it's archives/folders you can go into till you find images, then backspace till you are at the top level of the search. I believe that hydrus as it is right now is almost at the point where this is doable, or at least would be able to reach a workable level of polish. I also think you could integrate most of the aspects into hydrus proper but disable them from normal client use so you're not maintaining two completely separate code bases.

this is another aspect where I have tried and failed miserably to come up with a good tagging method, and where hydrus would easily solve an issue. on that note, you would have to modify one way tags are done: when a specific book is selected, some tags, let's say anal, would have page numbers next to them, like 20-25 60-71 84-85 112. at least for me, if I tagged something with a very specific thing and I wanted to see it, needing to go through 200-500 pages to find it… I can see that as an issue that many manga/hentai readers come across but have never tried to fix.

but that's just some of my thoughts on it.
(248.18 KB 2066x1764 client_2018-08-18_15-10-21.png)

>>9729 Oh and I forgot, if you have any idea how to deal with the bit that I circled in red, it would be much appreciated. I think it's trying to do something like generate thumbnails inside the archives, but I have no idea; older versions never took 5-10 minutes before a folder was workable.
>>9731 going to look into derpi a bit later, this is the tooltip, when the program hangs, this tooltip is stuck in the foreground till the program starts to respond long enough for it to fade.
>>9732 yeah i remember this issue, i was the guy who had it with like 1000 pictures or so >>9732 >Can FA links like the 8758805 one have multiple files, or is e621 in the wrong here with their URL attribution? FA links never have multiple files like pixiv or inkbunny, FA subs are 1 image per sub. i think i can see the problem: one of the 2 images is a fan-edit. The person who uploaded the edit put the source of the original, hence the confusion in the software, because both link to the same original submission in FA. this will probably repeat itself in the future with user-edits uploaded to e6 simultaneously with the original picture, as long as the users link to the original in both the original and the edits. am i on target here or am i misunderstanding how the URL recognition works? >>9727 Tumblr pixelated us to hell, but we're going even deeper, we're taking it ALL back!
(20.29 KB 430x319 1496027649394.jpg)

FUCK IT I'M REDOWNLOADING THEM ALL AND THEN SORTING THE TENS OF THOUSANDS OF DUPES NO GUTS NO GLORY NO CAJONES NO COLLECTION THIS WAY HYDRUS WILL ALWAYS RETAIN THE DELETED URLS AND KNOW THE TRUE VERSIONS OF ALL OF THESE IMAGES IF I DON'T MAKE IT BACK TELL STALLMAN TO STOP MOLESTING BIRBS FOR ME
>>9721 Why? What happened?
>>9740 some retarded soyboy bitch who couldn't be bothered resizing his files before uploading them to tumblr complained to tumblr that people were getting his patreon content through it because of the _raw URLs, so tumblr removed _raw urls all together, because soyboy bitches reason through groupthink instead of logic.
>>9742 So even the Amazon link fails?
>>9742 would love to know who it was.
>>9735 I don't have that in my version, but I think I remember maybe turning something like that off when I was using it more. There's some auto-backup options somewhere, maybe there are some indexing options near there? Like you can tell it not to check for updates to file structure or something. That shit was one reason why I started thinking about hydrus, actually. The difficulty of keeping up with a disparate file structure. >>9737 Yeah, looks like e621 has several examples of these unreliable source urls. I will add options to file import options to change the url check logic here (e.g. '[] don't trust source urls for 'already in db/deleted' checks). >>9744 Yeah, it is all gone. The amazon links still respond in different ways, suggesting the 'bucket' still exists, but you get variants of 'authorisation missing'. I presume they have basically tightened up their AWS-side back-end CDN to require authentication, so it is back to being a way for their cache servers to grab originals. tumblr are aware of the issue and have strongly suggested it is never coming back, so I imagine if we ever find a new way back in, they will only shut it down again. This is happening all over the place. My feeling is the future of original source is going to be the various boorus 'inheriting' from the actual original source, be that patreon or some tumblr post that links to HD on imgur or mega or whatever (although if the boorus use CloudFlare, they may be compressing jpegs on the fly! fuck!). In five years, I expect hydrus to have P2P file-sharing components. You can trust that any system I make wouldn't do this shit. But I can also imagine more complicated waifu2x-style systems in future that may nullify most of the issue.
>>9745 A furporn Patreon artist

