/hydrus/ - Hydrus Network

Archive for bug reports, feature requests, and other discussion for the hydrus network.


Release Tomorrow! hydrus_dev 06/19/2019 (Wed) 01:01:04 Id: 81774d No. 12951 [Reply]
I had an excellent and busy week catching up after E3. As hoped, I completed the second phase of the duplicates overhaul–all better/worse/same quality duplicate information is now stored more logically and efficiently. Duplicates processing is faster and duplicate groups are easier to manage. I also fixed a heap of bugs, including the clipboard watcher popup spam and the recent OS X shortcuts issue, and improved some maintenance code. The exit splash screen now has a 'stop doing maintenance' button! The release should be as normal tomorrow, maybe a little late as there is a bit more to test.
>>12951 Based and redpilled.

(1.46 MB 1289x1821 68390508_p0.jpg)

Anonymous 05/23/2019 (Thu) 21:13:11 Id: 09abfd No. 12677 [Reply]
Hydrus dev, I'm pretty sure I invented the proof to the big bang. If I get popular, I'll shout you out. That's not why I made this thread, though. I'm so anxious I feel like vomiting. I never said this out loud anywhere, but besides porn I have nothing else in my life besides Hydrus. I use Hydrus extremely casually, anyway. But Hydrus helps sort porn, so… I was gonna shout it out. I was gonna shout out my two favorite artists, too. I didn't have any plans to shout out anything else… So I imagine it'd be a huge influx of traffic. I can't handle this at all. I never asked for this at all. No one will believe me until it happens anyway. I have an appointment to show it to someone next week. It's fucked. I wish I could just be in protective custody and sleep for 20 hours and it'll be better then maybe.
10 posts omitted.
(1003.00 B 113x60 ClipboardImage.png)

wow very cool
(72.41 KB 1200x910 Ck6tWhXUoAAUHNI.jpg)

>>12678 Stop lying about being the dev. Stop lying to yourself to achieve social results you don't actually want. Stop lying in saying that you didn't mean to say anything. Stop lying because if someone gave you what you asked for, you'd push it away because you didn't trust it. Don't lie. Ever.

(14.78 KB 480x360 NBpReDYG1Fk.jpg)

Version 354 hydrus_dev 05/29/2019 (Wed) 22:54:51 Id: 273809 No. 12755 [Reply] [Last]
https://www.youtube.com/watch?v=NBpReDYG1Fk
windows zip: https://github.com/hydrusnetwork/hydrus/releases/download/v354/Hydrus.Network.354.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v354/Hydrus.Network.354.-.Windows.-.Installer.exe
os x app: https://github.com/hydrusnetwork/hydrus/releases/download/v354/Hydrus.Network.354.-.OS.X.-.App.dmg
linux tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v354/Hydrus.Network.354.-.Linux.-.Executable.tar.gz
source tar.gz: https://github.com/hydrusnetwork/hydrus/archive/v354.tar.gz

I had a great week. The first duplicates storage update is done, and I got some neat misc fixes in as well.

false positives and alternates

The first version of the duplicates system did not store 'false positive' and 'alternates' relationships very efficiently. Furthermore, it was not until we really used it in real scenarios that we found the way we wanted to logically apply these states was also not being served well. This changes this week!

34 posts and 1 image omitted.
>>12841 Yeah, I do plan to add videos to the duplicate system, and I originally designed it to eventually support them. The recent file maintenance system was a step forward in prepping for the CPU work we'll need to do to retroactively crunch the data on this in a reasonable way. I plan to do something like what you propose. The duplicates system currently works on comparing still images' shapes with each other, and it allows for multiple still image 'phashes' per file, so my task is selecting a good number of useful frames from videos that will match with others. If it is reasonably possible, I would like to do something more clever than just picking one frame per x time units or frames. This would line up right if our two vids were very exact conversions or resizes, but some of the codec changes drop a frame at the start or do 29.97fps vs 30fps bullshit that would desync our comparison. My original duplicates system did add vids by using the first frame, but so many have a black/flat colour first frame that it led to a billion false positive dupes. Vids are no longer included, and I also drop anything that looks too much like a flat colour image from the system entirely. If I could instead find the x most 'interesting' frames of a video, then 2-second gif clips of 20-second webms would have a higher chance of being matched, and 30/60fps conversions would too. I don't know, though. That is probably beyond me to do well, or maybe I can hack something that is good enough. I could do something like generating a phash for every frame in the vid and then have them compete with each other to remove similar-looking frames/phashes until the 20 most unique were left. It might pick up a bunch of false positives again with, say, a black screen with a bit of white text on (like an 'intro' title card) though.
Still, I am almost ready to do this now, and dupe work is proceeding, including more efficient storage of potential dupes, so maybe the answer here is to get a simple system in and then iterate on it.
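The 'have them compete until the 20 most unique are left' idea can be sketched as a greedy farthest-point selection over per-frame hashes. This is only an illustration under assumptions, not Hydrus code: the function names are hypothetical, and phashes are assumed to be plain integers compared by Hamming distance.

```python
from typing import List


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def pick_most_distinct(phashes: List[int], keep: int = 20) -> List[int]:
    """Greedily keep up to `keep` hashes that are far apart from each other.

    Starts from the first hash, then repeatedly adds the candidate whose
    smallest Hamming distance to the already-kept hashes is largest.
    """
    unique = list(dict.fromkeys(phashes))  # drop exact repeats, keep order
    if not unique:
        return []
    kept = [unique[0]]
    candidates = unique[1:]
    while candidates and len(kept) < keep:
        best = max(candidates, key=lambda h: min(hamming(h, k) for k in kept))
        kept.append(best)
        candidates.remove(best)
    return kept
```

A selection like this does nothing about the title-card problem mentioned above: a near-black frame with a little text is still 'distinct' from the rest of the video, so it would likely survive and would need separate filtering.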
>>12846 When I say literal garbage, I mean the image is drastically different, to the point I can't even see how it thought they were dupes, but those are from the asshole who specifically fucked with dupe detection when creating trash images. The dupe detector has enough wiggle room that even if the images aren't lining up perfectly, it may spit out something usable, and I used 1 second just because, no matter what I watch, 1 second isn't enough time for a 100% scene change, so it should pick up some duplicates from that. On the title card, you could make a generic 'black frame with text' image, with a few variations of it, and compare frames against it, so it would automatically know that everything matching it will be seen as a duplicate. If you are able to, try to get in contact with the people from what anime is this and see how they did theirs; it may give some ideas.
>>12846 Could always check how Video Comparer works. It's the best video dupe finder software I've used.

(4.03 KB 480x360 u9lowRlI0EQ.jpg)

Version 353 hydrus_dev 05/22/2019 (Wed) 23:05:25 Id: e175d6 No. 12669 [Reply]
https://www.youtube.com/watch?v=u9lowRlI0EQ
windows zip: https://github.com/hydrusnetwork/hydrus/releases/download/v353/Hydrus.Network.353.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v353/Hydrus.Network.353.-.Windows.-.Installer.exe
os x app: https://github.com/hydrusnetwork/hydrus/releases/download/v353/Hydrus.Network.353.-.OS.X.-.App.dmg
linux tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v353/Hydrus.Network.353.-.Linux.-.Executable.tar.gz
source tar.gz: https://github.com/hydrusnetwork/hydrus/archive/v353.tar.gz

I had a great week. I finished the basics of the new file maintenance system I wanted, cleaned up the duplicate filter a little more, and fixed a bunch of bugs.

file maintenance system

There are a number of large file re-checking jobs the client wants to do, both now and in the future. Going back to figure out more accurate video durations and image rotations, discovering webms that were formerly incorrectly detected as mkvs, eventually integrating videos into the duplicate checking system, all of these will require a combined whack of maintenance CPU that I don't want to hit all at once. I have previously sketched out some disparate systems for these jobs, but none were really doing the trick, so this week I unified it all into one nice system that can handle all sorts of jobs. This new system is simple for now but will get more work in future.

12 posts and 1 image omitted.
>>12713 Move left, move right, move first, move last on the video tab I have stupidly large remove convert down not hydrus view again and decide with check before remove music video in for good measure

I'll post this again in the new thread too. I decided that the converted down images will get a rating of Converted Down/Keep Large with 2 stars and not selectable as an option. It allows files to either be converted down and hidden from a search, or specifically keep the large file and hide it. I also removed all files from the archive; everything needs a filter pass. These images that were converted down were interesting enough for me to keep/convert down, but I am not really sure if I will put them in long-term storage or not. When I rated them en masse to converted down I got this error:
InterfaceError
Error binding parameter 0 - probably unsupported type.
Traceback (most recent call last):
  File "include\HydrusDB.py", line 561, in _ProcessJob
    result = self._Write( action, *args, **kwargs )
  File "include\ClientDB.py", line 12905, in _Write
    elif action == 'save_options': self._SaveOptions( *args, **kwargs )
  File "include\ClientDB.py", line 10112, in _SaveOptions
    self._c.execute( 'UPDATE options SET options = ?;', ( options, ) )

>>12753 Thank you for this report. This is an odd error–the problem here is that the options object is not being serialised into the db correctly. This is the kind of error I see when someone has like a 2GB subscription, and SQLite falls over trying to make a buffer big enough for it. The options object there is a small thing, automatically saved at various points. Perhaps it was being nullified in some way, or some invalid data was being added to it. I can't think immediately why it would be affected by a rating set event. Can you say more about this rating conversion? You had a 2-star rating service set up under manage services, and then when you did a giant ctrl+a->f4->set rating->ok, it took a moment to write that and then popped up this error right at that time? Had you set any other options recently, either in file->options or via one of the 'cog' menu buttons? Anything related to default sort based on that new(?) rating service?
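For context, the message in the traceback is the kind Python's sqlite3 module gives when a bound parameter is a type it cannot adapt. A minimal sketch of that class of failure follows; the one-column options table is a stand-in, and JSON is only an assumed placeholder for whatever serialisation the client really uses.

```python
import json
import sqlite3


def try_save(conn: sqlite3.Connection, options) -> bool:
    """Return True if the UPDATE ran, False if sqlite3 rejected the parameter."""
    try:
        conn.execute("UPDATE options SET options = ?;", (options,))
        return True
    except sqlite3.Error:
        # The exact exception subclass has varied between Python versions;
        # catching the sqlite3.Error base class covers them.
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE options ( options TEXT );")
conn.execute("INSERT INTO options VALUES ( ? );", ("{}",))

# A raw dict is not a type sqlite3 can bind, so this fails.
raw_ok = try_save(conn, {"default_sort": "rating"})

# The same data serialised to a string binds without trouble.
serialised_ok = try_save(conn, json.dumps({"default_sort": "rating"}))
```

So one thing worth checking in a report like this is whether something non-serialisable ended up inside the options object, rather than the object simply being too large.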

(24.70 KB 480x360 C7dVfQGn-0E.jpg)

Version 352 hydrus_dev 05/15/2019 (Wed) 22:05:33 Id: 6c1b1a No. 12583 [Reply]
https://www.youtube.com/watch?v=C7dVfQGn-0E
windows zip: https://github.com/hydrusnetwork/hydrus/releases/download/v352/Hydrus.Network.352.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v352/Hydrus.Network.352.-.Windows.-.Installer.exe
os x app: https://github.com/hydrusnetwork/hydrus/releases/download/v352/Hydrus.Network.352.-.OS.X.-.App.dmg
linux tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v352/Hydrus.Network.352.-.Linux.-.Executable.tar.gz
source tar.gz: https://github.com/hydrusnetwork/hydrus/archive/v352.tar.gz

I had a good week. .ico files are now supported, 'collect by' status is remembered in gui sessions, and I fixed a bunch of bugs.

duplicate overhaul plans

I started the duplicate overhaul work this week with some planning and experimentation with existing data. My original thought here had been to exactly replicate existing functionality just with a more efficient database schema, but having gone through the various edge-case insertion and merge operations, I believe the current system is overcomplicated for what we are actually using it for.

21 posts omitted.
>>12633 I suggested 30 minutes as the obviously safe option, where no one could possibly watch a preview for that long, but the majority of AFK sessions would be longer than that. I wouldn't expect to see much reduction in sessions pruned by going with 10 minutes, but who the hell is legitimately watching the preview for 12 minutes and wants that time recorded? Honestly, I'm not convinced this needs to be a setting. 10 minutes is way more than almost anyone will ever watch in a thumb. 60 minutes would cut out the vast majority of AFK sessions. It's such a broad range to work in that 30 minutes would do the right thing in 95%+ of cases (and I'm being cautious with my numbers), so why add more clutter to the settings for something almost no one would need to change?
(8.40 KB 815x24 media viewer.jpg)

>>12633 >>12654 Could any potential solution for this be extended to the media viewer? This seems like the same problem of getting distracted during, say, an archive/delete session, and leaving one image open (but minimized) for a few hours. I've had this, but also a similar problem with the media viewer (pic related). I basically had a bunch of 'alternate/duplicate' files, and I was flipping between all of them hundreds of times trying to decide which to keep, tags etc. Is it possible to detect if the client doesn't have focus, and stop tracking immediately? I feel like that would solve 90% of issues right off the bat. Otherwise, tying it in with the client's inbuilt 'idle' state could also work well (and is already user configurable in options -> maintenance and processing).
>>12656 >>12654 Thanks lads, only catching up now. This isn't in yet, but I'll keep this in mind as I do this system. I'll do min/max for media viewer as well. Options are easy to add, and I know someone will want to say 'no minimum time' or whatever, so I'll throw them in for anyone who is interested. 5s/10m sounds like an ok default min/max. I think I can catch a focus lost event, although some of that stuff is a little unreliable. I'll play around with it and we'll try iterating on this a bit.
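A rough sketch of how the min/max and focus ideas in this exchange could combine. The function and constant names are hypothetical, and the policy is an assumption: views shorter than the minimum or made without focus are dropped, and longer views are capped at the maximum. The 5s/10m figures are the defaults floated above.

```python
from typing import Optional

MIN_VIEW_SECONDS = 5.0        # the "5s" default suggested above
MAX_VIEW_SECONDS = 10 * 60.0  # the "10m" default suggested above


def recorded_viewtime(
    seconds: float,
    had_focus: bool = True,
    minimum: Optional[float] = MIN_VIEW_SECONDS,
    maximum: Optional[float] = MAX_VIEW_SECONDS,
) -> float:
    """Return how many seconds of a view should be recorded.

    Assumed policy: unfocused views and views shorter than `minimum`
    count as zero; views longer than `maximum` are capped at `maximum`.
    Passing None for a bound disables it ("no minimum time").
    """
    if not had_focus:
        return 0.0
    if minimum is not None and seconds < minimum:
        return 0.0
    if maximum is not None and seconds > maximum:
        return maximum
    return seconds
```

With these defaults a three-hour AFK session would be recorded as ten minutes, and `recorded_viewtime(3, minimum=None)` covers the user who wants every glance counted.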

(18.29 KB 480x360 J-p_0FDlpkw.jpg)

Sorting Motivation Thread Anonymous 05/21/2019 (Tue) 03:41:22 Id: ad7f64 No. 12651 [Reply]
https://www.youtube.com/watch?v=J-p_0FDlpkw ITT we post about our duplicates, tagging, and so on - our stats, goals, landmarks, and so on. Never give up anons, that tidy catalog of content is waiting for you!
12 posts and 1 image omitted.
>>12758 I do both. >you might have a bunch of images you dont care about That's what the archive/delete filter is for.
I have 1TB of inbox to go through while 800GB are now properly archived, as in, I have personally checked them and stuff. I've stopped working on dupes till the current dupe rework is done, since alternate not copying tags nukes my workflow, sitting at 80k dupes.
OP here, down under 60k dupes finally. >>12758 I save those to Downloads with the most important tags in their title for later import.

Release Tomorrow! hydrus_dev 05/29/2019 (Wed) 01:32:36 Id: 8080a4 No. 12749 [Reply]
I had a great week. The new 'false positive' and 'alternates' duplicates db storage design and migration worked out well, although it did take most of my time. Other than that, I did a variety of little new options and bug fixes. The annoying bug where a handful of thumbnails sometimes stop fading in seems to be finally fixed! The release should be as normal tomorrow.
1 post and 1 image omitted.
>>12751 I love hydrus man and hope he is doing well in all aspects of his life.
>>12752 >>12751 I am overall healthy and in a stable money situation. I hope you are as well. I get blackpilled sometimes, but then I remember my true troubles are all internal and that I have a lot to be grateful for compared to my ancestors. I still want to keep pushing on Hydrus every week for the foreseeable future. I love imageboards and all other Anons, including you–the cheeky fun we get up to keeps my soul going.
>>12754 Based and Bloomerpilled

(905.00 B 184x184 1455677.png)

Bibanon and WebUI thread Anonymous 09/14/2018 (Fri) 09:13:53 Id: bcbe46 No. 9960 [Reply]
It looks like the Bibanon wants to use Hydrus as a server, and they really want to slap a webUI on top of it… but Hydrus Dev doesn't have it on the priority list. They have thought about using Szurubooru for the WebUI but rr- doesn't want to update it. See: https://github.com/rr-/szurubooru and rr-@sakuya.pl Maybe it is time to draft up a sample API set and WebUI for them.

From antonizoon 2 Sep 2018

Hey there, it's been a very long while, but it's good to see that you've
kept working on Hydrus ever since.

One of the issues with hydrus for me in the end, which is why we never
ended up using it, is because of the lack of a webui. As such I ended up
utilizing Szurubooru, as can be seen here: https://eikonos.bibanon.org

However, what I can say is that the dev team at the Library of Congress
was very interested in adding tag support to their image tag system
which is below. It's currently utilized with the aim of crowdsourcing
transcriptions of images, but it lacks tagging or search like we would
have in a booru. Luckily, we already have the hydrus protocol and
szurubooru right? They aim to launch in October.

I find this to be a perfect chance to integrate Hydrus with a webui as
well as obtain some useful features from their existing system as well.
What do you think?

https://github.com/LibraryOfCongress/concordia
Possible CSS frameworks
https://tutorialzine.com/2018/05/10-lightweight-css-frameworks-you-should-know-about
https://dzone.com/articles/top-10-lightweight-css-frameworks-for-building-fas
https://www.catswhocode.com/blog/top-10-lightweight-css-frameworks-for-building-fast-websites-in-2018
http://www.creativeweblogix.com/blog/12-lightweight-css-frameworks
https://speckyboy.com/responsive-lightweight-css-frameworks/
Also whether to use Angular/React/Vue (JS web) vs Django/CherryPy/TurboGears/Flask/Pyramid (Python Web) comes into question
Contacts

2 posts and 1 image omitted.

(111.13 KB 400x400 1394958218797.jpg)

Q&A Thread: For simple questions that don't need their own thread Anonymous 07/04/2018 (Wed) 23:06:01 Id: 8ea9e7 No. 9327 [Reply] [Last]
Here you can ask questions so that the board is not clogged with small threads. >>6021 has reached its bump limit, so I made a new thread.
748 posts and 1 image omitted.
>>12593 >>12594 Yeah, try tag siblings: https://hydrusnetwork.github.io/hydrus/help/advanced_siblings.html The system is very imperfect, but it generally works. More time will be put into this this year, including better clientside preference management for tags synced over the PTR.
>>12596 There is no solution for this yet. I expect a future version of the tag siblings system to allow some sort of regex-based replacement (think a global rule for 'replace all "\s" with "_"') that will allow this. Since desiring underscores to either go away or be mandated is common, I am likely to hardcode a solution to manage this earlier than a generalised regex solution, but the tag siblings system just isn't clever enough to handle it yet.
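The global rule mentioned here ('replace all "\s" with "_"') is a one-line regex substitution. A minimal sketch with hypothetical helper names, showing both the underscore-mandating and underscore-removing preferences; how such a rule would actually hook into the siblings system is not specified above.

```python
import re


def normalise_whitespace(tag: str, replacement: str = "_") -> str:
    """Replace every whitespace character in a tag with `replacement`."""
    return re.sub(r"\s", replacement, tag)


def underscores_to_spaces(tag: str) -> str:
    """The opposite preference: display underscores as spaces."""
    return tag.replace("_", " ")
```

For example, `normalise_whitespace("long hair")` gives `long_hair`, and `underscores_to_spaces("blue_eyes")` gives `blue eyes`.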
>>12629 >>12632 I have yet to come down to a good description for what namespaces are myself. Here is my current first draft:

Namespaces are good when:
They are a higher category (evangelion is a series)
They are useful to highlight (characters are important)
They are useful to search for (creators are often searched for, and 'creator:*anything*' has value)
They are useful to group together (multiple creators is useful to know, when listed next to each other)

I've been approving 'clothing:' siblings on the PTR for a little while now and I overall like it. I'm mixed on some others though. My ideal solution here is to extend tag siblings to allow clientside preferences and then for you to say "If a group of siblings includes one with 'hair:' namespace, prefer that". I have found that it is easy for users to objectively agree that 'hair:long hair' and 'long hair' have the same semantic meaning, but the big subjective disagreement is over which is better.

An actually bad namespace, imo, is one that breaks the first rule above. 'male:erection' is an artifact of how some gallery sites do various female/male focus on tags, but I would rather 'male erection' (which could nicely have parents 'erection' and 'gender:male'). 'erection' is not a 'male'. Also 'erection:male'. 'general:coffee cup' and 'object:pencil' are technically correct but overspecific for most users, but again I think the specificity of namespaces is highly subjective, so the true answer is to let users define what namespaces tags could have and let them then customise what they prefer to show.
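The clientside preference described here ("if a group of siblings includes one with 'hair:' namespace, prefer that") could look something like the following. This is a hypothetical sketch, not the siblings system as implemented: the function names and the fallback to the first sibling are assumptions.

```python
from typing import Iterable, Sequence


def namespace_of(tag: str) -> str:
    """Return the part before the first colon, or '' for unnamespaced tags."""
    namespace, sep, _ = tag.partition(":")
    return namespace if sep else ""


def preferred_sibling(siblings: Sequence[str], preferred_namespaces: Iterable[str]) -> str:
    """Pick which tag in a sibling group to display.

    Walks the user's namespace preferences in order and returns the first
    sibling in a preferred namespace; otherwise falls back to the first
    sibling in the group (an assumed default).
    """
    for namespace in preferred_namespaces:
        for tag in siblings:
            if namespace_of(tag) == namespace:
                return tag
    return siblings[0]
```

Two users syncing the same group `["long hair", "hair:long hair"]` would then see different tags depending on whether 'hair' is in their own preference list, which sidesteps the argument over which form is better.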


(13.60 KB 480x360 Xx4SFBlaVWQ.jpg)

Version 351 hydrus_dev 05/08/2019 (Wed) 22:10:21 Id: 768770 No. 12507 [Reply]
https://www.youtube.com/watch?v=Xx4SFBlaVWQ
windows zip: https://github.com/hydrusnetwork/hydrus/releases/download/v351/Hydrus.Network.351.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v351/Hydrus.Network.351.-.Windows.-.Installer.exe
os x app: https://github.com/hydrusnetwork/hydrus/releases/download/v351/Hydrus.Network.351.-.OS.X.-.App.dmg
linux tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v351/Hydrus.Network.351.-.Linux.-.Executable.tar.gz
source tar.gz: https://github.com/hydrusnetwork/hydrus/archive/v351.tar.gz

I had an ok week. I sped up several systems and added a new processing panel to the duplicate filter. The 'next big job' poll is finished! I will next be focusing on overhauling the duplicate filter's db structure (including sketching out support for file 'alternates') and further improving the ui-side workflow.

duplicate filter

11 posts and 1 image omitted.
>>12520 Yeah, I don't like the sort of the sort. I've found it tricky to figure out a concise way of saying some of this stuff, like age vs time imported. Yeah, perhaps something like 'sort by dimensions: width' would help here to group them better. I'll make a job and think about this a bit.
>>12525 Thank you for this report! Sorry, I fucked up the new focus logic when adding the always-on-top ability. I'll 99.7% have this fixed for 352.
>>12554 Personally, just sorting them between file facts and database entries would be enough. Database would be things like tags, ratings, and so on that are really only applicable in the program; file facts would be things like pixel count, file size, height, width and so on. This would at least clearly separate the two. >>12551 Now that you mention putting it on another monitor, a floating window would be perfect. Currently I have a 4k monitor; I have the files open as the right half of the monitor, a magnifying glass as the second monitor, and if this stuff was floating, I could easily position it wherever the hell I please and not care at all. Not sure how easy that would be, but that would be a solution that fits most use cases.

Next Big Job Poll hydrus_dev 04/24/2019 (Wed) 21:39:10 Id: 3ccbc4 No. 12358 [Reply] [Last]
Here is the poll for the next big job: https://www.poll-maker.com/poll2331269x9ae447d5-67 You can vote for multiple items. I expect to start work on the top-voted item at roughly the time of the v351 release post, in two weeks. Please feel free to discuss and ask about items on the poll in this thread.
52 posts and 1 image omitted.
>>12506 Check the big list of repos in >>12295 especially https://github.com/andrewekhalel/sewar (it will come in handy)
>>12511 >>12506
1a) I am chiefly limited in how much time I have. I'd love to do more complicated autodecision workflows and backup ui, and I think that sort of thing would help a lot, but I don't have time in this rewrite. I also read up on some jpeg quality estimation here https://www.politesi.polimi.it/bitstream/10589/132721/1/2017_04_Chen.pdf , but it is too complicated for me to implement in the time I have. I would also like to focus on the db side more in this cycle.
1b) I am moving to a simpler comparison system in this rewrite. You'll always be seeing files compared to the 'best' of a group once it is done, which should exaggerate filesize and resolution differences. I am squeamish about autodecision on filesize or resolution alone as there are plenty of stupid bloated pngs of jpegs out there, but I think that bias could be part of a larger system that takes multiple variables to auto-decide.
1c) Sorry, I just don't have time to write clever ui like this atm.
1d) Yeah, I am afraid of the edge cases here. My thrust will always be default to off and lots of user customisation. I'll prep any auto-system with rules like "if exact same pixels and one is jpeg one is png, the jpeg is better", but I'd like for you eventually to be able to write your own rules for what you want out of it.
1e) Yeah, single pixel edits are a problem here. In a future iteration of any autodecision system that took multiple rules to make decisions, I think a blanket "if one pixel different, the older file is better" could be the ticket. There's also issues with file metadata being stripped or altered by CDNs.
2a) Yeah, that's the difficult stuff. Any dupe filter can't be simple, or any simple rules should be able to gauge certainty and pass the decision up to human eyes when something smells fishy. This clashes with situations like waifu2x, where it is a blow-up of the original, but a clever and presumably desirable one.
2b&c) My thoughts on the autodecision system would be to have it run in the background for very easy decisions, but confirming decisions (or maybe just some decisions with low confidence) with the user could be another way to go.
3) Duplicate metadata is not deleted when files are, so the delete/reimport cycle doesn't affect it. Saying 'remove this from all dupe pairs and requeue in the system' is not easy at the moment (I think you'll have to do a bunch of right-clicking on the thumbnail in advanced mode to show the exact pairs and then sever the relationships and requeue), but this will be easier with the new data structure I am designing. In the new system, I'd also like thumbs to load with their dupes, whereas at the moment that info is only fetched on right-click.
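The one concrete rule quoted in 1d ("if exact same pixels and one is jpeg one is png, the jpeg is better") can be sketched as a function that either picks a winner or defers to the user. Everything here is hypothetical illustration: the `FileInfo` fields, the idea of a precomputed pixel hash, and the mime strings are assumptions, not the client's data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileInfo:
    mime: str        # e.g. "image/jpeg" or "image/png"
    pixel_hash: str  # assumed hash of the decoded pixels, not of the file bytes


def auto_decide(a: FileInfo, b: FileInfo) -> Optional[FileInfo]:
    """Return the 'better' file of a pair, or None to leave it to a human.

    Only the easy case is decided: pixel-identical files where one is a
    jpeg and the other a png. Anything else is passed up for review.
    """
    if a.pixel_hash != b.pixel_hash:
        return None
    if {a.mime, b.mime} == {"image/jpeg", "image/png"}:
        return a if a.mime == "image/jpeg" else b
    return None
```

Returning `None` by default matches the 'default to off' stance above: a rule only fires when it is certain, and every uncertain pair still reaches the manual filter.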
>>12542 Yeah, all of 1 more or less requires the image difference comparison to be a thing. As for 1e, it's not even single pixels, it's resaving the exact same image over and over again as jpeg, slowly corrupting it over the course of 100-150 images. That's where I see a slow better/worse auto filter starting from a good image and quickly ending with the most corrupted one if no fail-safe was implemented.

(4.03 KB 480x360 JEXW8reB57A.jpg)

Version 350 hydrus_dev 05/01/2019 (Wed) 22:09:28 Id: d3155b No. 12458 [Reply]
https://www.youtube.com/watch?v=JEXW8reB57A
windows zip: https://github.com/hydrusnetwork/hydrus/releases/download/v350/Hydrus.Network.350.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v350/Hydrus.Network.350.-.Windows.-.Installer.exe
os x app: https://github.com/hydrusnetwork/hydrus/releases/download/v350/Hydrus.Network.350.-.OS.X.-.App.dmg
linux tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v350/Hydrus.Network.350.-.Linux.-.Executable.tar.gz
source tar.gz: https://github.com/hydrusnetwork/hydrus/archive/v350.tar.gz

I had an ok week. Some IRL things cut into my hydrus time, but I got some good work done. Some bugs in the new duplicate search system are fixed, and I improved advanced file delete and export handling. The poll for the next 'big job' is up here: >>12358

duplicate filter

The search addition to the duplicate filter went fairly well, but there were a couple of significant bugs. The 'ghost pair' issue–where a queue would sometimes have a final pair that would never display and lead to high CPU until the filter was closed–is fixed, and safeguards added to catch similar issues in future. The issue with undercounting on large search domains (typically where the dupe page's file query was non-system:everything and covered >10,000 files) is also fixed, but giving the filter 500,000 custom files to work with can be really quite slow. I will keep working here to see if I can speed up big searches like this without compromising accuracy, but if you find your dupe searches are working too slow, try adding a creator: tag to bring the search size down–it works well.

10 posts and 1 image omitted.
>>12488 I dove into the code here as part of my research into the dupe db overhaul yesterday and discovered some problems in my original 1.0. I feel less confident about it than I did for certain edge cases where 4+ files want to figure out what worst->best order they should go in. What happens to tags and whether the 'worse' of a pair is deleted is up to the duplicate merge options (check the cog icon on the dupe filter to edit these). I believe the defaults are to delete the worse of a pair, but if for some reason yours is not set this way, I think this will help your processing. Clearing 'bad' files out of the local file domain clears out some unusual AB, BC, AC pair comparisons the current system wants to do. My plans for the new storage system will do away with the over-autistic comparisons here and push for a simpler model with groups of dupe files with a defined 'best quality' King.
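To make the 'group with a King' model concrete, here is a small sketch of what such a structure might look like in memory. It is an assumption-laden illustration, not the planned database schema: the class, its method names, and the use of string hashes as file ids are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class DuplicateGroup:
    king: str                                   # id of the best-quality file
    members: Set[str] = field(default_factory=set)

    def __post_init__(self) -> None:
        self.members.add(self.king)             # the King is always a member

    def add_worse(self, file_id: str) -> None:
        """A file judged worse than the King joins; the King is unchanged."""
        self.members.add(file_id)

    def add_better(self, file_id: str) -> None:
        """A file judged better than the King joins and takes the crown."""
        self.members.add(file_id)
        self.king = file_id

    def merge(self, other: "DuplicateGroup", other_king_is_better: bool) -> None:
        """Absorb another group after one King-vs-King comparison."""
        self.members |= other.members
        if other_king_is_better:
            self.king = other.king
```

The point of the shape is that a new file only ever needs comparing against the current King, so a group of n files takes about n-1 decisions rather than every AB, BC, AC pair.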
>>12498 The archive/delete filter still uses the old delete system atm, just sending files to trash with a default reason. I would have added the new delete dialog to it, and likely will in future, but the batched way the archive/delete filter stores up and commits its actions makes it more complicated to integrate, so I put it off.
>>12499 I will not hardcode number shortcuts for certain controls for now. I think macros to tab&space navigate the dialog are your best bet atm. That's interesting about SFM, I didn't know about the image-based rendering. I had heard the software tends to crash a lot, especially as artists are often working on older laptops etc… Sounds like a nightmare. I assume there is no easy one-click open source workflow–that for instance just neatly eats up their pngs and throws them at ffmpeg–for these users, unless Handbrake can do something like that?

Release Tomorrow! hydrus_dev 05/07/2019 (Tue) 23:05:24 Id: d9ac2c No. 12503 [Reply]
I had an ok week. I cleaned a whole bunch of code, sped up image importing and the new duplicate operations, and wrote a new always-on-top action panel for the duplicate filter that makes for faster processing. The release should be as normal tomorrow.
1 post and 1 image omitted.
>>12504 Yeah, we might be fucked. 560M mappings already. I'll probably reserve the next 'big job' after this dupe work to trying to sort it out. A more temporary fix I may apply is a daily 10GB limit, which will stop uploaders from sending so much and throttle the incoming tags more smoothly over the month.
>>12509 *sigh If only we had a more web-friendly way of dealing with tags… Not saying social media works, but it is the first thing that came to my mind e.g. voting, crowd moderation etc. using pseudonyms? Sorry dev
>>12513 No worries. Having too much is a nice problem to have, and I have many ideas on how to mitigate our problems. Clearing out the (1) and done tags like 'title:tuhentu honeuhoen tuhoe nthn', or giving users options on how much of that stuff they want to sync with, will be a likely next step. There is plenty more we can do here, and I am keen to keep on pushing. I am committed to keeping person-to-person interactions Anonymous in my code–boorus already do username-based curation much better than I have time to do–but I know I can give users more power to share and sync just what is useful to them.

Release Tomorrow! hydrus_dev 10/23/2018 (Tue) 22:09:21 Id: 202b5d No. 10336 [Reply]
I had an ok week. I moved the login manager forward, dealt with all those periodic subscription popups from last week, and fixed up the tag siblings/parents layout. The release should be as normal tomorrow.
5 posts and 1 image omitted.
This information is useful and interesting, but I unfortunately do not have time to go through it in any clever way. I can't work on big things like new duplicate finding algorithms in normal weekly work–it'll have to wait for the next iteration of the duplicate system. I'd also like to prioritise the duplicate processing workflow, which is the larger problem with the system right now. I am most interested in face and other feature detection in cv. I expect to use some of this stuff when we push towards machine learning auto-tagging.
>>10338 In case of ghostbin closing down https://pastebin.com/KPaYiXNM

(20.52 KB 722x449 api-image-for-blog.png)

API Thread Anonymous 02/15/2019 (Fri) 03:58:35 Id: 1dd852 No. 11626 [Reply]
ITT: We propose new features that can be solved by using the API, and recommend new API commands for them
7 posts and 1 image omitted.

The Bisimplex collection

Danbooru
{
  "id":3382448,
  "score":1,
  "source":"https://i.pximg.net/img-original/img/2019/01/13/00/12/24/72630169_p7.jpg",
  "md5":"d5a650a70fca03ff391b50fc255dcb26",
  "rating":"s",
  "image_width":550,
  "image_height":770,
  "tag_string":"1girl :o ahoge bangs blue_eyes blue_hair blue_ribbon blue_skirt braid breasts collared_shirt eyebrows_visible_through_hair hair_ribbon juliet_sleeves kazutake_hazano long_sleeves looking_at_viewer parted_lips puffy_sleeves ribbon shiro_seijo_to_kuro_bokushi shirt short_hair sidelocks simple_background skirt solo underbust upper_body white_background white_shirt",
  "file_ext":"jpg",
  "last_noted_at":null,
  "parent_id":null,
  "has_children":false,
  "has_large":false,
  "is_favorited":false,
  "tag_string_general":"1girl :o ahoge bangs blue_eyes blue_hair blue_ribbon blue_skirt braid breasts collared_shirt eyebrows_visible_through_hair hair_ribbon juliet_sleeves long_sleeves looking_at_viewer parted_lips puffy_sleeves ribbon shirt short_hair sidelocks simple_background skirt solo underbust upper_body white_background white_shirt",
  "tag_string_character":"",
  "tag_string_copyright":"shiro_seijo_to_kuro_bokushi",
  "tag_string_artist":"kazutake_hazano",
  "tag_string_meta":"",
  "file_url":"https://danbooru.donmai.us/data/d5a650a70fca03ff391b50fc255dcb26.jpg",
  "large_file_url":"https://danbooru.donmai.us/data/d5a650a70fca03ff391b50fc255dcb26.jpg",
  "preview_file_url":"https://raikou4.donmai.us/preview/d5/a6/d5a650a70fca03ff391b50fc255dcb26.jpg"
}

Moebooru
{
  "id":276945,
  "tags":"ama_mitsuki brown_eyes brown_hair gloves hat original panties pantyhose skirt underwear",
  "source":"https://www.pixiv.net/member_illust.php?mode=medium\u0026illust_id=72633703",
  "score":12,
  "md5":"a0f138e4c0e07cf9e643407bf6019d8e",
  "file_url":"https://konachan.com/image/a0f138e4c0e07cf9e643407bf6019d8e/Konachan.com%20-%20276945%20ama_mitsuki%20brown_eyes%20brown_hair%20gloves%20hat%20original%20panties%20pantyhose%20skirt%20underwear.png",
  "preview_url":"https://konachan.com/data/preview/a0/f1/a0f138e4c0e07cf9e643407bf6019d8e.jpg",
  "preview_width":150,
  "preview_height":118,
  "sample_url":"https://konachan.com/jpeg/a0f138e4c0e07cf9e643407bf6019d8e/Konachan.com%20-%20276945%20ama_mitsuki%20brown_eyes%20brown_hair%20gloves%20hat%20original%20panties%20pantyhose%20skirt%20underwear.jpg",
  "sample_width":1100,
  "sample_height":863,
  "jpeg_url":"https://konachan.com/jpeg/a0f138e4c0e07cf9e643407bf6019d8e/Konachan.com%20-%20276945%20ama_mitsuki%20brown_eyes%20brown_hair%20gloves%20hat%20original%20panties%20pantyhose%20skirt%20underwear.jpg",
  "jpeg_width":1100,
  "jpeg_height":863,
  "rating":"q",
  "has_children":false,
  "parent_id":null,
  "width":1100,
  "height":863
}

The Andy iOS collection

Danbooru
created_at: Date string in the format "yyyy-MM-dd HH:mm:ss"
rating: possible values - e, s, u, q
md5: String
height: Int
width: Int
preview_url: URL
large_file_url: URL
sample_height: Int
sample_width: Int
tags: String with tags, each tag separated with spaces
id: Int
source: URL, String, or nothing

Moebooru
created_at: (This should return the number of seconds since the epoch and be a number)
rating: possible values - e, s, u, q
md5: String
height: Int
width: Int
file_url: You should try to return the full URL, including the protocol and host name (the app tries to build it when they are missing)
preview_url: Same as above
sample_height: Int
sample_url: String
tags: String with tags, all separated by spaces.
id: Int
source: Where the image comes from. Can be a url, string, or missing
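
To make the difference between the two formats concrete, here is a minimal Python sketch that maps both into one common record. It uses the key names from the sample JSON objects posted earlier in this thread (image_width / tag_string for Danbooru, width / tags for Moebooru). The Post type and the function names are invented for illustration and are not part of any existing API.

```python
from typing import Any, Dict, List, NamedTuple, Optional


class Post(NamedTuple):
    """Common record both booru formats are mapped into (hypothetical type)."""
    id: int
    md5: str
    rating: str
    width: int
    height: int
    tags: List[str]
    file_url: Optional[str]
    source: Optional[str]


def from_danbooru(raw: Dict[str, Any]) -> Post:
    """Map a Danbooru-style object, which uses image_width and tag_string."""
    return Post(
        id=raw["id"],
        md5=raw["md5"],
        rating=raw["rating"],
        width=raw["image_width"],
        height=raw["image_height"],
        tags=raw["tag_string"].split(),   # tags are space-separated
        file_url=raw.get("file_url"),
        source=raw.get("source") or None,  # empty or missing becomes None
    )


def from_moebooru(raw: Dict[str, Any]) -> Post:
    """Map a Moebooru-style object, which uses width, height and tags."""
    return Post(
        id=raw["id"],
        md5=raw["md5"],
        rating=raw["rating"],
        width=raw["width"],
        height=raw["height"],
        tags=raw["tags"].split(),
        file_url=raw.get("file_url"),
        source=raw.get("source") or None,
    )
```

Feeding either function the output of json.loads on a response body gives the same record shape, so anything downstream only has to know one layout.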
>>12467 >>12468 >>12469 Really? https://pastebin.com/vzQTRFaa since ghostbin is shutting down

local variable 'photoshop' referenced before assignment Anonymous 05/01/2019 (Wed) 15:26:28 Id: 3d3c95 No. 12454 [Reply]
Hey, awesome tool, really enjoying it. I did encounter an error that came up three times when fetching files from Sankaku (out of probably around 10k+ files imported so far, I've only just started using hydrus last week):

local variable 'photoshop' referenced before assignment… (Copy note to see full error)
Traceback (most recent call last):
  File "include\ClientImportFileSeeds.py", line 1178, in WorkOnURL
    self.DownloadAndImportRawFile( file_url, file_import_options, network_job_factory, network_job_presentation_context_factory, status_hook, override_bandwidth = True )
  File "include\ClientImportFileSeeds.py", line 571, in DownloadAndImportRawFile
    self.Import( temp_path, file_import_options )
  File "include\ClientImportFileSeeds.py", line 790, in Import
    ( status, hash, note ) = HG.client_controller.client_files_manager.ImportFile( file_import_job )
  File "include\ClientCaches.py", line 1144, in ImportFile
    file_import_job.GenerateInfo()
  File "include\ClientImportFileSeeds.py", line 283, in GenerateInfo

Message too long. Click here to view full text.

Thank you for this report. This actually isn't my code failing here, but the image library I use, Pillow. Just guessing, I suspect these files are slightly malformed (broken) and Pillow is having trouble dealing with them. As it happens, I plan to do some cleanup work next week to have a different library, OpenCV, do the initial metadata parsing work here and improve reliability of metadata parsing overall. I will test these URLs once that is done and see if they work. Thank you for the examples!
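
For context, "local variable ... referenced before assignment" is the message of Python's UnboundLocalError, raised when a function reads a local name that no branch has bound yet. The sketch below is a made-up illustration of that pattern and one defensive way to handle it. It is not Pillow's actual code, and the function names and block layout are hypothetical.

```python
from typing import Iterable, Optional, Tuple

Block = Tuple[str, bytes]


def read_photoshop_block(blocks: Iterable[Block]) -> bytes:
    """Hypothetical parser: `photoshop` is only bound if a matching block exists."""
    for name, payload in blocks:
        if name == "photoshop":
            photoshop = payload
    # If no block matched, the name below was never assigned and Python
    # raises UnboundLocalError.
    return photoshop


def read_photoshop_block_safe(blocks: Iterable[Block]) -> Optional[bytes]:
    """Same lookup, but binds a default first so a missing block yields None."""
    photoshop: Optional[bytes] = None
    for name, payload in blocks:
        if name == "photoshop":
            photoshop = payload
    return photoshop


def try_read(blocks: Iterable[Block]) -> Optional[bytes]:
    """Caller-side guard for when the parser itself cannot be changed."""
    try:
        return read_photoshop_block(blocks)
    except UnboundLocalError:
        return None
```

A slightly malformed file that lacks the expected block would trip the first function but not the other two.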
