/hydrus/ - Hydrus Network

Archive for bug reports, feature requests, and other discussion for the hydrus network.


(13.63 KB 480x360 8oKeK8yGOGE.jpg)

Version 327 hydrus_dev 10/24/2018 (Wed) 22:06:36 Id: cec9be No. 10359
https://www.youtube.com/watch?v=8oKeK8yGOGE

windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v327/Hydrus.Network.327.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v327/Hydrus.Network.327.-.Windows.-.Installer.exe
os x
app: https://github.com/hydrusnetwork/hydrus/releases/download/v327/Hydrus.Network.327.-.OS.X.-.App.dmg
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v327/Hydrus.Network.327.-.OS.X.-.Extract.only.tar.gz
linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v327/Hydrus.Network.327.-.Linux.-.Executable.tar.gz
source
tar.gz: https://github.com/hydrusnetwork/hydrus/archive/v327.tar.gz

I had an ok week. The login manager has moved forward, and I did a few other things as well.

login manager

Still advanced users only. The login manager now saves itself, and I've done the dialog for regular users to set up their login credentials, under network->downloaders->manage logins. Feel free to poke around. It still isn't switched on, so custom login scripts will not run yet.

The advanced side is missing some good test ui, and the regular side is missing a similar 'attempt login/test these credentials'. I'd like to roll something out next week and turn the system on at the same time.

misc

The spammy/false-positive 'periodic file limit' warnings that subscriptions started giving last week with the new, more liberal syncing check are now gone. You'll now only get a warning about the file limit when it matters (when the sub only found new files on the check).

Subscription merge is a bit more helpful. It lets you select which is the 'prime' sub, into which to merge the others, and has some other behind-the-scenes data improvements as well.

Several new pixiv downloading issues, mostly to do with multi-page content like manga, are fixed.
The manage tag censorship, siblings and parents dialogs are now all moved to the new panel system, a notebook instead of the buggy old listbook, and also use my recent new sizer. They should fit better on smaller screens and expand better on larger ones. Let me know if you still have trouble with them, as I can tweak a lot more easily now.

I have finished some new 'getting started' help for using the new download system. If you have yet to download anything in client but want to try, please check it out here:
https://hydrusnetwork.github.io/hydrus/help/getting_started_downloading.html

full list

- login stuff:
- finished off some login script data stuff
- fleshed out how login credentials and other linked data is stored in the login manager, including script link recovery when the script changes but name does not
- improved some initialisation login validation error handling
- improved login failure validation error handling
- wrote a dialog panel for managing login credentials and reviewing validity and so on
- a heap of related session and login tie-in/fix-up work
- the login manager will now save changes to the db. it will get the HF and pixiv scripts on db creation/update, and if you have a pixiv login, the login system will pre-fill that info and 'activate' the script (although the login manager will not fire any login scripts yet; if so configured, it'll just delay on a polite error message)
- .
- other stuff:
- with the subscriptions' new more liberal syncing logic, the periodic file limit will now only pop up if the sub does not see any already-seen files
- to give more buffer for the new syncing logic, file import caches will now store 250 entries minimum on compaction (was 100 previously)
- subscription merging now lets you choose the primary subscription into which the other subs will be merged
- cancelling a subscription merge action mid-merge is now safely nullipotent
- post urls that use subsidiary page parsers (such as the new pixiv manga parser) will now correctly insert (rather than append) their manga urls into the file import cache
- removed a couple of places where urls could accidentally be duplicated in a file import cache
- cleaned up some areas where successful file import objects were presumed to have file hashes when they might not (this was causing errors when importing urls that split into multiple url children, like pixiv manga, while also having 'additional tags' set)
- updated tag censorship, parents, and siblings dialogs to the new panel system
- tag censorship, parents, and siblings panels now use a notebook instead of the layout-borked listbook
- tag parents and siblings panels now use the new small-resolution-friendly sizer, are more tight by default, and expand more neatly
- refactored a bunch of tag ui code to clientguitags
- the client video renderer will now deal with videos with (invalid) duration of 0 more gracefully
- finished the 'getting started with downloading' help page, sans the login stuff
- bit of other help work

next week

I would like to 'finish' the login manager, which I think will mean turn it on at a 1.0 state. I suspect I will have to go back a bit for a week or two more to clean things up, and I'd also like to roll out some login scripts (and downloader updates) for Deviant Art and Fur Affinity and some other big sites. We'll see how it shakes out.
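The 'insert (rather than append)' item in the changelog above can be illustrated with a small sketch. This is not the client's code: it assumes the file import cache can be modelled as a plain ordered list of url strings, and `insert_child_urls` is a hypothetical helper name. It only shows why inserting keeps a multi-page post's child urls next to their parent, and how skipping urls already present avoids duplicates.

```python
from typing import List


def insert_child_urls(cache: List[str], parent_url: str, child_urls: List[str]) -> List[str]:
    """Return a new list with child_urls placed directly after parent_url.

    Hypothetical model: the real file import cache holds more state
    than a list of strings.
    """
    new_cache = list(cache)
    try:
        position = new_cache.index(parent_url) + 1
    except ValueError:
        # Parent is not in the cache, so fall back to appending at the end.
        position = len(new_cache)
    for url in child_urls:
        if url in new_cache:
            # Skip urls already present so the cache holds no duplicates.
            continue
        new_cache.insert(position, url)
        position += 1
    return new_cache


cache = ["post/1", "post/2", "post/3"]
print(insert_child_urls(cache, "post/2", ["post/2/p0", "post/2/p1"]))
```

With a plain append, the two child urls would land after "post/3" and be processed out of order relative to their parent post.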
>>10354 Damn, if that's all computer processing doing it, shit has come a really long way. That said, with the way people compress png right now there are methods to fix it, and with jpeg there are fairly predictable ways shit can get compressed, at least on light compression. A 1mb image that this person made 200kb could fairly easily be fixed, but jesus christ would it be a nightmare for a 1mb to 50kb compression.

With generating art based on artists, that's quite a bit more iffy. It's completely doable today, but the issue is who is going to feed the program enough of a person's art, or which artists are at the pinnacle of what they do, so that you could legitimately put their art into something and have it read it till it understands the why. I'm also iffy on decensoring: it would look good, but unless you have the artist's samples, generating based off that would also be fairly hard and require human touches.

Line art: right now we have waifu2x, and we also have photoshop's palette knife tool. When I would scan in shit I did, I would use that to generate a more than good enough digital lineart. A mix of the two could very easily do lineart.

But yeah, if my memory serves, the difference between 100% uncompressed and losing 5% of the detail is not noticeable by the human eye, but the difference in file size is 1/10 to 1/20. Granted, almost all my tests this way have been video, and full motion can hide quite a bit of the loss.

>>10355 Yes, my issue comes in at around 5-7 galleries downloading at once. Them finding the files doesn't fuck with the program, but them downloading them can grind it to a halt. With thread watchers, yeah, I want them hit right away, but the way I deal with galleries, being able to start them paused for file downloading would be a great help.
>>10357 lol it brings the number up to 230k, and that's just 1.5 million files. I have another 1.3-1.7 million that the dup detector currently isn't seeing due to a db fuck up nearly a year ago now, and that was at around 130-170k on its own too. Compound that with me waiting on some form of visible download notations, so I know why files have been deleted before I cull anything, and this is becoming a bigger and bigger issue.
When downloading a 4chan/8chan thread with the thread watcher and with 'get tags' enabled, it will only grab the last filename in the thread and apply it to all of the files.
(13.97 MB 4140x6045 IMG_20181015_0006.jpg)

Holy shit, I just found something that has pissed me off more than almost anything else has in a while, and I thought I would share it because of the compression talk last update. This image here is about 90% perfect. If I took it into photoshop and made the whites white and the dark areas black, it could actually be a 2 color image. As it stands, without touch up, I can make it a 3 color image and lose almost 0 quality, and it becomes a 2.23mb png. However, the dipshit who scanned it fucked up the right 10%, so anything less than 32 colors looks fucking horrible. But holy shit, in all irony, converting the image to png even with stupid amounts of color options STILL comes out at a lower file size… I mean holy shit, how do you fuck up like this?

Looking into it a bit more, and trying to do this all without going to photoshop to fix things: applying dithering to it, which usually adds a significant chunk of space, actually fixes the issue, and a 2 color almost works. The main issue is paper grain… Honestly I think I will take this into photoshop…

Also going to dump a mega link here to my experiments, because at the very least I find them interesting:
https://mega.nz/#!jpBDUaxQ!F59f43bm0RZNe_3lv021j7VKxFQVroly0sPTpGzIw54

Think I'll dick around some more after a photoshop pass to bring out the blacks and try to suppress some of the paper.
Could we get a 'hide existing siblings' checkbox to cut down on the (will display as) spam on the manage siblings panel? The manage tags panel should have a 'collapse siblings' checkbox for the same reason.
I made some progress with running the python source. Now it complains because it can't import "ordered_dict". I posted the details in the old thread, since I forgot to check for the new one: >>10377 tl;dr: I can't figure out which package to install to get "ordered_dict". Similar sounding ones didn't fix it. Any help would be appreciated.
(3.82 MB 8280x12090 2x_super_sampling_1bit.png)

>>10375 Hm, the guy probably didn't want to cut the book, or bend the spine all the way open, so it ended up out of focus. The problem with this page is, that it was painted digitally as gray scale, so the half-tone pattern is generated automatically, which results in a much smaller pattern than manga pages with manually applied patterns. I think for 1-bit color, the scan's resolution is not high enough. And the used scanner is crap too. The half-tone pattern is too blurry, there is a lot of aliasing and color noise, so the pattern's dot size in the 1-bit conversion doesn't look uniform enough. You would need super-sampling with at least 600 dpi, which would increase the file size though. I tried making my own super-sampled 1-bit version. It was converted to gray scale, and upscaled to 200%, using b-spline interpolation in krita. B-splines are smooth curves that don't produce ringing artifacts (which are a problem in bi-cubic or lanczos sampling), so it turns the pixelated dots into nice round dots. Then I ran auto-contrast to get sane levels and afterwards applied the threshold filter, with a manually adjusted threshold of 160, because the bright shading on her thighs turned fully white at the default 128 threshold. With a size of 4 MB it's somewhere in the neighborhood of your 4 color version. I don't have any ideas for restoring the half-tone pattern in the out-of-focus area yet. The floyd-steinberg dithering you tried looks too noisy for my taste. Generating a new half-tone pattern would be possible, but matching it to the pattern on the rest of the page would need a lot of manual work.
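The auto-contrast and threshold steps described in the post above can be sketched in a few lines. This is a minimal illustration, not what krita does internally: it assumes the page is already a grid of 0-255 grayscale values, it leaves out the b-spline upscale entirely, and treating values at or above the cutoff as white is an assumption of this sketch. It shows why moving the cutoff from 128 to 160 keeps light shading that would otherwise turn fully white.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale values


def auto_contrast(image: Gray) -> Gray:
    """Linearly stretch values so the darkest pixel is 0 and the brightest 255."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        # Flat image: nothing to stretch.
        return [list(row) for row in image]
    # Integer arithmetic keeps the result deterministic.
    return [[(v - lo) * 255 // (hi - lo) for v in row] for row in image]


def threshold(image: Gray, cutoff: int = 128) -> Gray:
    """Map every pixel to 0 (black) or 1 (white); values >= cutoff become white."""
    return [[1 if v >= cutoff else 0 for v in row] for row in image]


# A dark line, a patch of light shading, and blank paper.
row = [[0, 140, 255]]
print(threshold(row, 128))  # the shading pixel (140) turns white
print(threshold(row, 160))  # the shading pixel stays black
```

On a real scan the same two functions would run on the upscaled image, so each half-tone dot covers enough pixels for the cutoff to produce round dots rather than jagged ones.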
Ok, thinking a bit more on this from last thread, about having things start paused, I got an idea. Loading in a bunch of galleries or threads makes the program hang; the degree is open to question, as not everyone has my retarded setup. However, thinking on it further, I had a few ideas.

1) only work on the file side
This is because just looking up the website to find files is usually a quick and painless experience. 4chan grabs everything right away, and galleries look up with long enough in-between times to not be too disruptive.

2) a play all files button
The program should know which queries have files to get, and pressing that button would just start them up.

3) a pause all files button
Worst comes to worst and you need the program to be responsive, a pause all button would help.

4a) a delay option for unpausing
So, let me put the way I would use this out there. I would go to 4chan, harvest threads I find interesting/worth saving images from, and import them all paused, with an exception for /b/ probably. Once all the threads were loaded in, I would unpause all the files, but if there isn't a delay, it would likely take 5-10 minutes a board for everything to catch up and be responsive. Now this isn't an issue if I have all the files going at once, but currently I will get 2 or 3 of the slower boards in before it's completely unresponsive, wait, then get most of the rest 1 board at a time.

4b) a universal drop down menu 'unpause all files' option
This would just look for queries that have files that have not been harvested and unpause them so they start getting files.

For me, just looking at my most recent imports, there is a spread of about 2 hours from when I started the imports to when I stopped. If I do this before I go to sleep, it's not an issue how long the program hangs, but if it hangs mid harvesting, I have to stop what I'm doing and sit on files and links till I'm able to import.
Granted, thinking of this a bit more, part of the problem I have could be solved if there was a menu part for where the source was from. Let's say, for example,
https://8ch.net/hydrus/index.html
https://boards.4chan.org/a/
are the two boards. It would go
(subject) (domain) (board) - the rest as normal
so
Version 327 | 8chan | /hydrus/
Fuck is this garbage | 4chan | /a/
or possibly
Version 327 | 8/hydrus/
Fuck is this garbage | 4/a/
as the only things that use watchers currently are 8chan and 4chan. Mix this in with a possible 2 sort stack, so you could sort by domain/board along with most recent, and this would allow every query to be on one page, as opposed to my current sorted method of 11 watchers.
-----
That said, on update every single watcher had their shit updated to show new file counts. Honestly couldn't be happier with how that turned out.
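The label layout proposed above is easy to derive from the watched url. The following is only a sketch of the idea, not anything in the client: `watcher_label` and the `SITE_NAMES` table are hypothetical, and it assumes the board is always the first path segment, which holds for the two example urls in the post.

```python
from urllib.parse import urlparse

# Hypothetical mapping from a domain to the short site name shown in the label.
SITE_NAMES = {"8ch.net": "8chan", "4chan.org": "4chan"}


def watcher_label(subject: str, url: str) -> str:
    """Build a '(subject) | (domain) | (board)' label for a watched url."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    # Match the domain itself or any subdomain, e.g. boards.4chan.org.
    site = next(
        (name for domain, name in SITE_NAMES.items()
         if host == domain or host.endswith("." + domain)),
        host,
    )
    # The board is taken to be the first path segment: "/hydrus/index.html" -> "hydrus".
    segments = [s for s in parsed.path.split("/") if s]
    board = "/{}/".format(segments[0]) if segments else "/"
    return "{} | {} | {}".format(subject, site, board)


print(watcher_label("Version 327", "https://8ch.net/hydrus/index.html"))
print(watcher_label("Fuck is this garbage", "https://boards.4chan.org/a/"))
```

Because site and board end up as separate fields before formatting, the same parsing could feed a two-level sort (site and board first, then most recent) without touching the label text.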
>>10380 Mine is completely untouched by any program to get the contrast right. If you still have the image, try passing it through the dither filter. My main problem with people who do the scans is that we have effectively a 1 bit image, but they don't bother trying to make the file 1 bit.

I'm just going to say, personally I have a 4k 55 inch, and most of the time, whether the image is larger than the screen or not, I limit it to adjust to screen height, so for me that is 2160, or about 1/3 or 1/9th the resolution of the actual image. With the dither, outside of zooming in, it looked near perfect. But note the spot on the left leg highlights: in the actual image there are 3 or 4 different highlight areas, and yours gets 2 or 3 of them, it's hard to tell. With a dither it may get them all, and when whatever you use to read manga resizes them and makes its own samples, that may actually be enough for the image to be near perfect. But god damn, 4mb for an 8000x12000 image that looks that good even though it's not perfect.

As for not wanting to ruin the binding, that's plausible, but it was a 20 or so page book, so it could have been done with a staple. There are also scanners specially made for scanning things like this that go right into the crease without creating the spine issue. But then, to distribute, they saved it as a png that took nearly 14 mb a page, when you can get even 4 times the pixel count for less if you have any idea what you are doing. I didn't even notice the blurriness till I tried to convert; they could have very easily dithered it and I may have never caught on.

I wish more people understood this. Apparently some of them did a long fuck off time ago, as some of my older dbz/ranma scans are gifs instead of jpegs. Same principle as the png 1 bit, but from back when size was even more of a consideration. It seems that ever since people got big hdds they just don't care at all about file size.
>>10378 >>10333 +1 on this. I'm running Manjaro and they just rolled out an update, which seems to be what broke it (I was running it fine literally just before I updated). I've been using yay to keep it up to date from the AUR (which, incidentally, is still serving version 326). These are some of the upgrades that may have done me in, dunno. Might put it back on my windows machine until I can figure something out.
python-urllib3
python-requests
python2-urllib3
python2-requests
python2-zope-interface
python2-twisted
>>10383 Ok seems like you already sort of know what's going on. Commenting out the offending code for now, we'll see if it breaks anything
Is it me, or is someone occupying #hydrus in irc.rizon.net ?
>>10360 I don't know enough of the specifics, but I understand these colourising tools have multiple variables to change tint and saturation and so on, so they still need some humans to steer them and get something looking pretty, and I assume another layer of human touchup. It is still early stages, but it seems completely possible. The tech only gets better from here. I feel that for training ML for 'this is what x looks like', hydrus and other large metadata archives can be exactly the right tool. We can easily generate a giant selection of fairly accurate 'this image has "ribbon"' with which to train a model, and I hope our tag corpus will be something we can evolve from for our own personal purposes. Again, I don't think we'll be generating unique da Vincis any time soon, but I expect to start auto-tagging using ML in the next few years. Some ML projects do 'take this image and make it look like it was painted by x'. I think we could head in something like that direction, particularly for things like clearing up jpeg artifacts, either on a per artist scale or more like a 'this is mostly what lines look like in manga, so nice line pixels should look like this' way. I am not enough of an expert on this, and I guess most of this tech is still up in the air and not yet written to know for sure what we'll really practically be dealing with. For me, it is mostly just fun to talk and think about. The Illust2vec project used danbooru as a model, iirc, and was breddy gud at suggesting tags. More than anything, I care a lot about running this stuff on local machines with local CPU/GPU cycles. Google et al are rushing ahead with ML and will certainly be the first to do interesting real things, but I'd rather have privacy and freedom.
>>10368 Thank you for this report. I will check this out this week.
There's an issue with the pixiv manga downloader this week, and some other similar single-page->multiple-file results like artstation. The original url gets 'failed' but the files go through ok. This is a false-positive error I will have fixed for v328. If your subscriptions notice this and complain, please hang in there. I apologise for any inconvenience.
>>10375 >>10380 >>10382 Interesting that the region on the right is more visible in the thumb, like some moiré.
>>10376 That's a good thought, thanks. I'll be moving the checkboxes in manage tags to a cog icon this week or next, so I'll roll that into that. I'll see if I can figure something out for manage siblings as well.
>>10378 >>10383 >>10384 Thanks, yeah, it looks like the new version of requests can't unpickle a session from the old version. v327 should have a fix for this, basically just by clearing your sessions. If you can't get v327 due to AUR whatever, the manual fix is:
MAKE A BACKUP OF CLIENT.DB
run the sqlite3 executable in the db dir, and enter (you should be able to copy/paste):
.open client.db
DELETE FROM json_dumps WHERE dump_type = 46;
.exit
And try to boot the client. It'll moan that your sessions manager is missing and say it made a fresh one, but should otherwise be fine.
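For anyone without the sqlite3 executable to hand, the same manual fix can be sketched with Python's built-in sqlite3 module. The table name, column name, and the value 46 come straight from the statement above; the function name and the '.bak' backup filename are made up for this sketch. It assumes the client is fully closed before it runs.

```python
import shutil
import sqlite3
from pathlib import Path

SESSION_DUMP_TYPE = 46  # value taken from the DELETE statement in the post


def clear_saved_sessions(db_path: str) -> int:
    """Back up the database file, then delete the saved session rows.

    Returns the number of rows removed.
    """
    db = Path(db_path)
    backup = db.with_name(db.name + ".bak")  # hypothetical backup name
    shutil.copy2(db, backup)

    conn = sqlite3.connect(str(db))
    try:
        with conn:  # commits on success, rolls back on error
            cursor = conn.execute(
                "DELETE FROM json_dumps WHERE dump_type = ?;",
                (SESSION_DUMP_TYPE,),
            )
            return cursor.rowcount
    finally:
        conn.close()
```

Calling `clear_saved_sessions("client.db")` from the db dir does the backup and the delete in one step; the copy happens first, so a failure in the delete still leaves the original file's backup in place.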
>>10381 I like the idea of a global pause/play files, and it wouldn't be super difficult to add either. I'll also add the cog icon to change default state for new queues.
>>10387 Some guy registered something hydrus related on rizon a while ago and told me. I said feel free, but I am not an irc person so I couldn't promise to hang out there or anything. I don't know if it is the same dude and channel.
>>10402 Alright, it starts from source now with 327 and seems to work without issues so far.
>>10413 >>10402 Hm, "Fatal IO error 11 (Resource temporarily unavailable) on X server :0" still occurs running from source though.
>>10402 I had it hang on a different X window error and ended up killing it:
>The program 'client.pyw' received an X Window System error.
>This probably reflects a bug in the program.
>The error was 'BadRequest (invalid request code or no such operation)'.
> (Details: serial 1284195 error_code 1 request_code 0 minor_code 0)
> (Note to programmers: normally, X errors are reported asynchronously;
> that is, you will receive the error a while after causing it.
> To debug your program, run it with the --sync command line
> option to change this behavior. You can then get a meaningful
> backtrace from your debugger if you break on the gdk_x_error() function.)
>Killed
Overall the stability when running from source seems to be the same as running from the binary version. It's just these nasty X errors every now and then. Searching for the "Resource temporarily unavailable" error shows that it's often a problem of multiple threads accessing the same x server socket:
https://stackoverflow.com/questions/13755366/xio-fatal-io-error-11-resource-temporarily-unavailable-on-x-server-0-after
https://stackoverflow.com/questions/46265906/how-to-open-multiple-pygame
Does hydrus do this?
(194.28 KB 740x1080 fixed.png)

>>10375 Just scale it down and run it through a noise filter a few times :^)
>>10417 While that works and does produce acceptable results, there is an issue: it's a fairly significant downgrade in quality. Personally I would rather have extremely large resolution 1 bits that scale down in programs. This way, with higher resolutions your images get better and you lose no detail. Ideally it would be a scan once and it's good forever kind of thing. Granted, if a vector scan happened that could see how big dots are and where they need to be placed, it's possible that could outdo it on quality for relatively low file sizes, but that's a pipe dream till someone actually wants to archive actual print media and not just the print media's words.
>>10397 From what I know of coloring, it used to be done all by hand, but now you have tools that follow objects on screen for special effects. You could apply that to this, where it's tracking objects, then adding color on top, and it does it automatically. With enough ai in the background you could easily have 'looks like this, it's brown; looks like this, it's green' for grass. Look into deep fakes, as this is likely the closest we will get to doing that to artists, people who would never make more, superimposed on porn scenes. And it's community made, so you see real results that don't have big businesses backing them with unlimited funding. As far as tagging goes, I still stand by everything should be human approved.

>>10403 Definitely looking forward to it.
>>10416 >>10414 That's frustrating. It usually relieves it significantly, if not fixes it completely.

When I moved over to the new wx last year, some Linux environments got a lot stricter about when I could access ui objects. Mostly it is accessing variables of ui objects from the non ui thread. I cleaned up a whole ton, as many bad accesses as I could find, but there must still be a couple out there. Unfortunately, because I am in python, it is tricky to get good tracebacks or dumps for these errors, and my test environments here don't have the problem any more. I am sorry that I can't offer a good fix here.

If you notice that doing one thing tends to cause the error (or, in many cases, sets a state that will cause a crash up to 30 mins later), please let me know, and I'll have an area to focus on to see where I am screwing up. A trick I ran for a while when fixing this the first time was to have people just open the client and leave it for a bit, doing nothing. Then trying again, doing one thing, and then trying again doing another, to logically figure out what dialogs or actions were causing a crash state. If you have the patience and enthusiasm, this would be helpful to me.

Can you say which flavour and version of linux you are on, and which window manager?
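The class of bug described above, a worker thread touching a ui object directly, can be shown with a toolkit-free sketch. Nothing here is hydrus code: `UiOwner` is a made-up stand-in for a widget, and its `call_after` method only imitates the general idea behind helpers such as wxPython's `wx.CallAfter`, which queue a call to run later on the ui thread. The guard in `set_label` is also an addition of this sketch, there to turn a silent bad access into a visible error.

```python
import queue
import threading


class UiOwner:
    """Stand-in for a ui object: only the thread that created it may touch it."""

    def __init__(self):
        self._owner = threading.get_ident()
        self._pending = queue.Queue()
        self.label = ""

    def call_after(self, func, *args):
        """Safe from any thread: queue func to run later on the owner thread."""
        self._pending.put((func, args))

    def set_label(self, text):
        # Guard that makes a cross-thread access fail loudly instead of silently.
        if threading.get_ident() != self._owner:
            raise RuntimeError("UI object touched from a non-UI thread")
        self.label = text

    def process_pending(self):
        """Run all queued calls; must be called from the owner thread."""
        while True:
            try:
                func, args = self._pending.get_nowait()
            except queue.Empty:
                return
            func(*args)


def worker(ui, errors):
    try:
        ui.set_label("direct")  # wrong: direct access from a worker thread
    except RuntimeError as e:
        errors.append(str(e))
    ui.call_after(ui.set_label, "queued")  # right: hand the call to the owner


ui = UiOwner()
errors = []
t = threading.Thread(target=worker, args=(ui, errors))
t.start()
t.join()
ui.process_pending()
print(errors, ui.label)
```

In a real toolkit the event loop plays the role of `process_pending`, and there is usually no guard, which is why a bad access can corrupt state and only crash some time later.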
>>10419 Yeah, for tagging, my idea for a first version here is to add a new suggestions column to manage tags that'll provide rows like '95% confident of "equine_penis"'. So it'll just be a neat tool to make tagging quicker. Because of high error rates with this stuff, I always want human eyes confirming any automatic decision. By the time we can have good systems that don't have any human in the loop, I expect we'll be more concerned with stopping the cloud of flesh-eating nanobots from consuming everything we love.
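The suggestions column described above could be fed by something as small as the following sketch. It is purely illustrative: `suggestion_rows`, the 0.5 cutoff, and the example scores are all invented here, and the model that would produce the scores is out of scope. The output is only a list of rows for a human to accept or reject; nothing is applied automatically.

```python
from typing import Dict, List


def suggestion_rows(scores: Dict[str, float], min_confidence: float = 0.5) -> List[str]:
    """Format model scores (0.0 to 1.0) as rows for a human to confirm."""
    kept = [(tag, score) for tag, score in scores.items() if score >= min_confidence]
    # Highest confidence first; ties are broken alphabetically for stable output.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return ['{}% confident of "{}"'.format(round(score * 100), tag) for tag, score in kept]


# Made-up scores for a single image.
for row in suggestion_rows({"ribbon": 0.95, "skirt": 0.72, "hat": 0.3}):
    print(row)
```

Low-scoring tags are dropped rather than shown, on the assumption that a short list of likely tags is quicker to review than a long one.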

