/hydrus/ - Hydrus Network

Archive for bug reports, feature requests, and other discussion for the hydrus network.


(24.70 KB 480x360 C7dVfQGn-0E.jpg)

Version 352 hydrus_dev 05/15/2019 (Wed) 22:05:33 Id: 6c1b1a No. 12583
https://www.youtube.com/watch?v=C7dVfQGn-0E

windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v352/Hydrus.Network.352.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v352/Hydrus.Network.352.-.Windows.-.Installer.exe
os x
app: https://github.com/hydrusnetwork/hydrus/releases/download/v352/Hydrus.Network.352.-.OS.X.-.App.dmg
linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v352/Hydrus.Network.352.-.Linux.-.Executable.tar.gz
source
tar.gz: https://github.com/hydrusnetwork/hydrus/archive/v352.tar.gz

I had a good week. .ico files are now supported, 'collect by' status is remembered in gui sessions, and I fixed a bunch of bugs.

duplicate overhaul plans

I started the duplicate overhaul work this week with some planning and experimentation with existing data. My original thought here had been to exactly replicate existing functionality just with a more efficient database schema, but having gone through the various edge-case insertion and merge operations, I believe the current system is overcomplicated for what we are actually using it for.

Most of all, the current system tries to form a chain of 'better/worse' comparisons so all dupes within a 'same file' group are ranked with each other. A decent number of human decisions are needed to determine this ranking, but the data is currently not displayable, and we haven't really noticed that absence. For most practical circumstances, what we really want to determine is what files should actually be considered dupes, and which of those groups is then the best. Most users delete the 'worst' of a pair in any case. Supporting a system that simply tracks a group of duplicates with a single King is more intuitive and reliable, and it is quicker to work with.
This 'King of a group' idea also maps nicely to how we use 'tag siblings'; having a complicated tree or chain of worst tag to best is not as useful as simply replacing all the lesser members of a group with the King as needed. When I get to overhauling tag siblings, I expect to make a similar change.

But in the meantime, for duplicates, I now have a plan. I expect to spend a few more weeks filling out the full details in code, and then I will switch us all over. The existing workflows should remain the same, just with fewer and easier comparisons. I will not do much specific work on file alternates, but they will be feasible to start on once the db overhaul is done. I will also continue to put time into the duplicate filter ui itself. Overall, I feel good about it all. I'd like the whole thing to be done within 8-12 weeks.

otherwise all misc this week

.ico files should now be supported! .cur files (which are basically the same) should work as well.

'collect by' settings are now, finally, saved in page sessions! If your default collect by settings include any ratings services, they will be forgotten on update, so you will have to reset them in options->sort/collect this one time.

I fixed the stupid issue where media viewer hover windows were popping up over manage tags and some other dialogs. This was due to a flaw in the changes from the new always-on-top duplicate hover panel. I apologise for the inconvenience. Some related OS X specific weirdness should be cleaned up as well.

The 'unclose_page' shortcut (default Ctrl+u) now uncloses pages in the correct order!

When a media fits the media viewer exactly (so 100% zoom fits the width or height exactly), the 'zoom switch' action (default 'z') now correctly restores back to 100%!

'open externally' should work better for some custom program paths. The flash projector (for .swf files) was opening without an ui, for instance.
If you have had other programs seem to open in the background from open externally calls, please give them another go and let me know if they now work for you.

full list

- the client now supports importing .ico files! (.cur should be supported too)
- finally, 'collect by' is saved for sessions! if your default collect by previously included ratings services, it will forget them this one time. please reset it under the options->sort/collect
- fixed the issue where the media viewer's hover windows were hovering over child dialogs (manage tags, ratings, or known urls)
- improved some os x hover window focus handling for the new always-on-top duplicate action window
- the entries on the 'sort by' list on gui pages are now subcategorised better. it should be a bit easier to find what you are looking for
- the 'sort by file: approximate bitrate' sort option now sorts still images as well by filesize / num_pixels
- to reduce confusion, sort by mime and system:mime are now renamed to 'filetype'
- fixed an issue where the 'unclose_page' shortcut was restoring pages in reverse order (unclosing least-recently-closed-first rather than most-recently-closed-first)
- improved rigour of video framerate estimation
- stopped the video metadata parser from opting to manually frame count videos with size >128MB or num_frames estimate >2,400
- fixed the forced manual frame count to deal with frame counts >9999
- the 'ffmpeg not found' error on file import will now put up a popup message once per boot informing you of this problem more broadly and steps to address it
- fixed some underreporting issues with subprocess_report_mode
- fixed an issue with some yes/no dialogs returning 'no' on escape/window_close_button rather than 'cancel', which affected cancelability of some db maintenance questions
- fixed an issue where media that fitted the media viewer canvas width or height exactly at 100% zoom would not respond to zoom switch events to restore non-100% zoom to 100%
- when a local server's CORS mode is turned on, Access-Control-Allow-Origin is now correctly added to GET/POST requests with an Origin request header
- improved reliability of some timestamp rendering code, which should help some users who had trouble opening cookies management page after malformed cookie import
- I believe I fixed an issue with 'open externally' on certain custom paths where the external program could spawn without an ui (flash projector did this). please let me know if your 'open externally' calls start making terminal windows everywhere
- fixed a runtime stability issue with the new duplicates page and slow-updating counts that come in after the page has been deleted

next week

Next week is an 'ongoing' week, where I work on a medium-sized improvement to an existing system. I think I would like to put some time into my 'background file maintenance' plans, unifying the current prototype systems into one that runs nicely in idle time and adding some ui controls for it. There are several pending file-reparsing jobs I would like to queue up (e.g.
checking mkvs vs webms with modern file parsing code, and apng discovery, scheduling background thumbnail regen, and fixing bad old frame counts and durations), and if I want to integrate videos into the duplicates system, I'll need a better framework here to schedule that retroactive CPU work. Otherwise I have a couple little cleanup jobs to be getting on with, and I'll start some new duplicates db code.
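The 'King of a group' plan from the duplicate overhaul section above can be pictured with a small in-memory sketch. This is not the hydrus schema or code; the class name, method names, and the merge rule (the merged group keeps the king of the 'better' file's group) are all assumptions made for illustration.

```python
class DuplicateGroups:
    """Hypothetical sketch: each group of duplicates has exactly one King."""

    def __init__(self):
        self._group_of = {}   # file hash -> group id
        self._members = {}    # group id -> set of file hashes
        self._king_of = {}    # group id -> hash of the group's King
        self._next_id = 0

    def _ensure_group(self, file_hash):
        # A file seen for the first time is King of its own one-member group.
        if file_hash not in self._group_of:
            gid = self._next_id
            self._next_id += 1
            self._group_of[file_hash] = gid
            self._members[gid] = {file_hash}
            self._king_of[gid] = file_hash
        return self._group_of[file_hash]

    def set_better(self, better, worse):
        """Record one 'better/worse' decision between two duplicate files."""
        keep = self._ensure_group(better)
        drop = self._ensure_group(worse)
        if keep == drop:
            # Same group already: only the King can change.
            if self._king_of[keep] == worse:
                self._king_of[keep] = better
            return
        # Assumed rule: fold the worse file's group into the better file's
        # group, and keep the better group's existing King.
        for file_hash in self._members.pop(drop):
            self._group_of[file_hash] = keep
            self._members[keep].add(file_hash)
        del self._king_of[drop]

    def king(self, file_hash):
        return self._king_of[self._ensure_group(file_hash)]

    def members_of(self, file_hash):
        return set(self._members[self._ensure_group(file_hash)])
```

No ranking between the non-King members is stored, which is the simplification the post argues for: a decision only ever merges groups or replaces the King.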
I was the one who asked for collection settings to be saved last week. Really appreciate the new feature. I was wondering if the same could be done for "sort by". Right now, it applies the sort preferences whenever you open a new page but does not apply it when the default session is loaded on startup. I have a library collection of different artbooks in jpg format set to load as a default session. I would like for it to load the cover on startup but right now it comes up with a random page. Hopefully, it would be a very easy fix, and would be good for the sake of consistency.
You added the temp_dir parameter in v344. Can you explain how to use it? Because when I use it like this, Hydrus just doesn't start. If I remove the temp_dir part it works fine. Maybe it's a bug. "X:\Hydrus\Hydrus Network\client.exe" -d="Y:\db_hentai" -temp_dir="X:\Hydrus\Hydrus Network\temp" (this is a windows shortcut btw)
minor. refreshing a page appears to bunk the set order/collect by for the current page, unordering (seemingly random as it's default) - need to manually re-set.
i would love the ability to link/associate a rating to a tag, and have all tags inherit that rating. tier/rank 10 to berserk, for example. first time caller (but not actually). but, just want to gush, hydrus is absolute top tier, much love.
pps. audio support (specifically for videos) would make hydrus the one stop shop for all things.
>>12597 >>12584 Thank you for these reports. It looks like the sort is now not applying right after the collections changes. I will look into this this week and am 99.7% confident I can figure out a better and more reliable way of broadcasting this info to the media page. At the moment it is a bodged two-part call–I'll see about unifying it into one so there is no ambiguity in future. Note that sort is not supposed to apply on loading a session as pages like downloaders and ones spawned from 'show these in a new page' on thumbnail drag-and-drop have arbitrary sort that I don't want to overwrite. Ideally the original 'correct' sort should be preserved through the session save/load.
>>12592 Try two hyphens: --temp_dir="X:\Hydrus\Hydrus Network\temp" Unfortunately, the way I have to bundle the client.exe means I can't print early boot errors back to the console, so I think it was dumping the 'can't parse that' error to the void. Looking at the code, I think I can catch the specific argparse error here better and highlight it to the user through a dialog window. If you would like to check what is available, I think the server has the same options atm. Open a command line and hit this to double-check: server --help BTW, I can't remember if I document the temp_dir in the html help anywhere. Have I written it with only one hyphen anywhere? I should highlight the full list in the help menu or something.
>>12598 >>12599 Thanks, I am really glad you like it! Yeah, I'd love to have a generalised metadata implication rule system. Some anons like tags like 'medium:webm' just because they prefer a tag to system:filetype. I don't like these tags on the PTR, but if I could write a system where you could say: IF mime = video/webm THEN ADD 'medium:webm' ON local tags It would solve the whole problem, permit me to remove those tags from the PTR in good conscience, and remove the human pain of keeping up with adding these tags. The same sort of system could work for you, linking ratings and tags. I use ratings to help my own processing workflows just because clicking a circle is quicker than typing a tag, but if my rating also meant 'cute elf workflow' on my local tags, that'd help. A complicated rule system is probably far off and a 'big job'. But maybe I can hack in some simpler options. Attaching tags to like/dislike ratings is probably not that difficult. Numerical (starred) ratings is more tricky, but I'll think about it. Audio support is something I'd personally love. It came high up on the last 'big job' vote, I think third. I suspect I will work on it this year. There are two problems I see to overcome: I have no experience with audio code, so I'm a bit in the dark and I'll need to do some research, and the current video rendering pipeline is very hacked-together high-level python code. A dropped video frame here and there isn't a huge deal, but I am not super confident I can get smooth (e.g. no spitting stops and starts) audio playback working on the current system. But I don't know what I am talking about with the various buffers and things I have available, so I am willing to give it a go. At the very least, I know I can add 'has audio' metadata so you can search for it and have a little speaker icon or something on the media viewer so you know if it is worth double-clicking a webm to hear it in an external program. 
The file metadata regen maintenance pipeline I'll be working on this week will be critical to spending the CPU time to retroactively figure out has_audio metadata.
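The 'IF mime = video/webm THEN ADD medium:webm ON local tags' idea above could look something like the sketch below. No such rule system exists in hydrus as far as this thread says; `FileInfo`, `Rule`, `implied_tags`, and the 'cute elf workflow' rating service are hypothetical names used only to show the shape of it.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical view of a file's metadata."""
    mime: str
    liked_on: frozenset = frozenset()  # like/dislike services set to 'like'


@dataclass(frozen=True)
class Rule:
    """IF condition(file) THEN ADD tag ON service."""
    condition: Callable[[FileInfo], bool]
    tag: str
    service: str = "local tags"


# Example rules: one from the post, one linking a like/dislike rating to a tag.
RULES = [
    Rule(lambda f: f.mime == "video/webm", "medium:webm"),
    Rule(lambda f: "cute elf workflow" in f.liked_on, "cute elf workflow"),
]


def implied_tags(info, rules=RULES):
    """Return {service name: set of tags} implied by the rules for one file."""
    result = {}
    for rule in rules:
        if rule.condition(info):
            result.setdefault(rule.service, set()).add(rule.tag)
    return result
```

Because the tags are computed from metadata on demand, nothing would need to be stored on the PTR, which is the benefit described in the post.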
Hey dev, could you add a limit to the preview and media view time tracking? I currently have 20 days of preview viewing time and 35 minutes of media viewing time, because I can't be arsed to make sure to unselect every time I minimize Hydrus and go do something else. I open it a few days later to check the new stuff from my subs, and Hydrus dutifully records that I've been looking at a single preview image for 3 days straight. Would just ignoring preview viewing sessions longer than 30 minutes be a reasonable fix for this?
>>12605 As the guy who added "Medium: webm" tags to everything, the main reason I did that was because I noticed that some webms were tagged wildly incorrectly. (Putting anime tags on an eric andre webm was the example I remember, but there might be others I'm not aware of.) Either that was an intentional act of malice, to sneakily and maliciously tag a poor innocent eric andre webm as a lewd anime pic, or at some point the PTR points to two different pieces of media and assigns them the same hash and therefore the same tags, something that is inevitable with a software as all-encompassing as hydrus. In short, the pigeonhole problem. Given that there's no way to send images and video through hydrus, to show that you are tagging these images in good faith, it seemed prudent to add some simple, automatically gathered bits of information on a certain piece of media and add it to the PTR. That way, anyone tagging a picture and seeing "Medium: Webm" as a tag will know something somewhere is fucky, and will hopefully complain about it here. That's not going to be solved with a generalized metadata rule system, since such a system will be automated and will therefore not spot these errors that a human might see. It's still a good idea, just not suited to the purpose you have in mind for it.
>>12609 >maliciously tag a poor innocent eric andre webm as a lewd anime pic I think a more common reason for this is stuff like grabbing threads and tagging all the images after the thread theme, accidentally tagging reaction images because they didn't double check.
>>12603 >BTW, I can't remember if I document the temp_dir in the html help anywhere–have I written it with only one hyphen anywhere? I should highlight the full list in the help menu or something. I dunno, I only know about it from your changelog where you didn't mention any hyphens. >added temp_dir parameter to the client and server that will override which temporary directory the program will use And "-d=" uses only one hyphen? Why the difference? But yeah, adding a full list somewhere would probably be a good idea.
>>12609 >the ptr points to two different pieces of media and assigns them the same hash You shouldn't worry about that. It is 45 times more likely a rogue asteroid crashes on Earth within the next second, obliterating civilization-as-we-know-it, and killing off a few billion people, than two different files getting the same SHA-256 hash. :) https://stackoverflow.com/questions/4014090/is-it-safe-to-ignore-the-possibility-of-sha-collisions-in-practice
Good work so far dev! I'm liking the always-on-top tagging box and looking forward to tournament-style elimination of inferior duplicates. I don't want to add tags to the PTR on download anyways. What I'd like is if I could sort and tag files to my heart's content, then add a batch tag to them to show they were ready to add to PTR, then I could look up that batch tag, select all, remove the batch tag, hit a button to automatically copy over all of my local tags to the PTR, then add a new batch tag to local tags only showing that those files have already had their tags sent to the PTR. It would be nice if this could work on a trust system where I can auto-clone say 2 images' worth of tags per day at first, then once someone reviewing the PTR approves my tags as "well, decent and useful enough" (not having to be perfect according to their personal tagging standards, just decent and not too spammy or inaccurate), maybe that cap moves to 4 or 50 or whatever, and you can work your way up from there. That way people could get the convenience of automated tag cloning to PTR without everyone having free rein to spam it, and there'd be some way for careful taggers/curators like me to share our finished work to the PTR other than rewriting every tag for every file. ALSO what is probably not feasible but would be cool is if the PTR could somehow track some common duplicates and show tags for dupes as well as the file itself.
Ok hdev, ran into an issue of sorts that has a fix/workaround, but I'm wondering if something more could be done. So since the whole delete labeling happened, I have been going through the large files. So far I have gotten rid of at least 40gb of files (and quickly replaced the files with more data), but I am coming to a close on the 10mb+ files, and I ran into a problem. If I search the archive for new 10+mb files, I will get any file I decided not to do something about, and would be redoing it. The first thing I can think of is a rating or a tag that either denotes to keep the files, or that this is the result of dealing with a file. This would keep quite a bit of bullshit out of a new search… however, this requires me to add another filter whenever I do something like this, a dealt with/keep original stipulation. So I'm wondering if there is an easy way to add something to files that would act as a user-definable implicit file search? Personally I see use in this just to make reviewing large files easier without making an oopsie, which would more likely piss time away than anything else. I'm also wondering if there could be a way to hotkey move files to different tabs. I doubt that currently it would be easy to set up a multi hotkey bind, but would it be possible to select a tab, 'highlight' said tab, and whenever the hotkey was pressed, the file would move to said tab? It would help quite a bit with en masse file parsing.
>>12607 Sure, thank you, this is a good idea. I think I can retroactively compensate a little here as well, and truncate any current preview viewtime for any media to be (x minutes * current view count). Thinking about this, to me 30 mins seems high as a cap. Maybe 10 mins? I don't do a lot of staring at the preview screen personally and have preview stats hidden on my IRL client so I am not sure of the workflows here. Would you ever legit look at a media for more than 10 mins in the preview window? Is there even value in more than 2 minutes? Maybe for a long gif/webm. I suspect there would be disagreements here, so I won't pull the trigger on this this week. Maybe I should add an option for max track time in media/preview and then have it truncate on setting that. 30 mins seems a fine safe default for that, but I'm interested if anyone disagrees strongly.
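The retroactive truncation mentioned above, capping stored preview viewtime at (x minutes * current view count), is simple arithmetic. A minimal sketch, assuming viewtime is stored as a total number of seconds alongside a view count; the function name and the 30-minute default are illustrative rather than anything hydrus defines.

```python
def cap_viewtime(total_seconds, view_count, max_minutes_per_view=30.0):
    """Truncate a stored viewtime total to at most max_minutes_per_view per view."""
    cap_seconds = max_minutes_per_view * 60 * view_count
    return min(total_seconds, cap_seconds)


# 20 days of recorded preview time over 12 views is cut to 12 * 30 minutes.
twenty_days = 20 * 24 * 60 * 60
print(cap_viewtime(twenty_days, view_count=12))  # 21600.0 seconds, i.e. 6 hours
```

Totals already under the cap are returned unchanged, so legitimate viewing history is not touched.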
>>12609 Thanks. I've previously had users tell me about preferring to search with tags rather than system, so it is interesting to hear your view as well. As for bad tags, in my experience from the development and admin-side, almost every single incorrect tag I have come across has been due to scripting or typing/selection mistake or ESL (English as a Second Language) issues. There's a few jokes in there that I've vacillated on removing when I stumble across them, but those are overwhelmed by some lad accidentally applying 'character:pencil drawing' or 'imageboard conversation' to 3,000 images by mistake, usually through an automatic system that is misfiring. And disputes over tag siblings tower over all that (I still get pussy->vagina and vagina->pussy proposed every other week or so, and refrain due to the outcry last time), so figuring out a way to let users easily choose their preferred sibling is likely my next priority in this area. Btw, as >>12617 points out, you don't have to worry about hash collisions for hydrus. If you check the table here: https://en.wikipedia.org/wiki/Birthday_problem#Probability_table And look for the row for 64 hex characters (for SHA256), you'll see that a hydrus file domain needs 1.5×10^34 files before there is even a one in a billion chance of a hash collision. With current tech and cracking methods, it seems to be impossible to even intentionally cause a collision with SHA256, as is now possible with MD5. You can be very very certain that a hash in hydrus applies to the same file, and we should be good for a while!
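The figures in that probability table can be checked with the standard birthday-bound approximation, n ≈ sqrt(2 · H · ln(1 / (1 − p))), where H is the number of possible hash values. The sketch below assumes hashes are uniformly random; the function name is made up for illustration. Under this approximation a 256-bit hash needs roughly 4.8×10^32 files for a one-in-a-trillion collision chance and roughly 1.5×10^34 files for one-in-a-billion.

```python
import math


def files_for_collision_probability(p, hash_bits=256):
    """Approximate number of random hashes needed to reach collision probability p."""
    space = 2.0 ** hash_bits
    # -log1p(-p) equals ln(1 / (1 - p)) and stays accurate for very small p.
    return math.sqrt(2.0 * space * -math.log1p(-p))


for p in (1e-12, 1e-9):
    print(f"p = {p:g}: about {files_for_collision_probability(p):.2e} files")
```

Either way, the count is far beyond anything a real file collection will reach, which is the point being made in the post.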
>>12616 It is an old Unix style in which single characters have one hyphen, and words have two. Often the single character switch has a word synonym as well (in hydrus I set -d to be the same as --db_dir, and the default -h usually gives the same as --help). I think the double-hyphen for words came about because some parsers do a variable right after the single character, like '-qH' for something like 'high quality', and so having '--quality high' or '--quality="high"' with double hyphens helps to reduce parsing ambiguity. Whatever the case, I'll make this stuff more visible. It is easy to forget what is important to highlight when I wrote the entry myself.
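Since argparse was mentioned earlier in the thread, the one-hyphen versus two-hyphen behaviour can be reproduced with a small sketch. The option names -d, --db_dir and --temp_dir come from this thread; the parser setup, the help strings, and the `try_parse` helper are assumptions and not the actual hydrus boot code.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="client")
    # Single-character switch with a word synonym, as described above.
    parser.add_argument("-d", "--db_dir", help="database directory")
    # Word-style option, so it takes two hyphens.
    parser.add_argument("--temp_dir", help="override the temporary directory")
    return parser


def try_parse(argv):
    """Return the parsed namespace, or None if argparse rejects the arguments."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # argparse prints a usage error to stderr and exits. A windowed program
        # with no console could catch this and show a dialog instead.
        return None


good = try_parse(["-d=Y:\\db_hentai", "--temp_dir=X:\\Hydrus\\Hydrus Network\\temp"])
bad = try_parse(["-d=Y:\\db_hentai", "-temp_dir=X:\\Hydrus\\Hydrus Network\\temp"])
print(good.temp_dir)  # X:\Hydrus\Hydrus Network\temp
print(bad)            # None, because -temp_dir is not a recognised option
```

With a console attached, the rejected call would print an 'unrecognized arguments' style error, which is the message that got lost in the bundled client.exe.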
>>12618 Thanks. When I first started out with hydrus, I thought I would spend more time on the client-server interactions as you suggest here, with more admin controls and weighting/reputation of user submissions and so on, but it mostly, and moreso in the past couple of years, got swamped by clientside-only improvements to the downloader and so on. I suspect I am biased a little as well, since I'm not an admin/mod by nature so I don't feel much personal draw to add those capabilities to the PTR specifically. I have a high tolerance for the funposting and mess that comes with Anons, so I am comfortable with how undisciplined it is. That said, I don't rule those sorts of workflows out for future iterations of the server. It is a bit of a pain in the ass to run (it is all debug-tier interface, since I am the primary user), but you might like to look into running your own tag repository. You can curate your own garden of tags exactly how you want and share them with zero to n friends with read-only or full write capability. If you do so, let me know how you get on and what user-management tools and better commit workflows you find you would like in practise, and I'll put some time into it. If you would still rather interact with the PTR, just with a better workflow on your end with your batch-with-a-tag idea, you might be able to patch something together right now with the 'advanced content update' dialog, which you can launch from the manage tags dialog under the cog button (you need help->advanced mode turned on). You might be able to queue up some files for submission, load them in a page, remove that marker tag, and then open the advanced content update dialog on the 'local tags' page and go 'copy current and pending mappings' for these 50 files to 'PTR'. Again, this dialog is really just debug tools that I have put just a bit of time into to make slightly friendlier, so if you end up using it a bit more, let me know how it goes.
>>12630 I assume you have already done archive/delete filter on these files, so they are all in the archive? Otherwise I would suggest using inbox and archive to track files you have processed or not. You can send files back to the inbox for another round of archive/delete processing with Shift+F7, btw. I have a couple of like/dislike ratings on my client that I use as pseudo inboxes. One is called 'read later' that I apply to anything I am archive/delete filtering that is like a big thread screencap or a 15 min video on some bullshit that I don't want to deal with in a quick filter. I enjoy like/dislike ratings since I can just click to quick apply and have some shortcuts set up for this as well. Maybe you could make a new like/dislike rating called 'videos to process' or something with some shortcuts to like/nullify to help you keep track of all this? Mapping things to tabs is tricky. I'll be doing some Client API tab-access work soon, so I'll be thinking more closely about this in a couple weeks. The tricky part is mapping the action to the tab–'send file right one tab' is fairly easy to manage, and has a simple yes/no error state, but 'send file to page named "sexy elves"' is just that step more difficult to track and action and fail with (e.g. there could be two pages with that name). I like the idea of setting a current 'destination' tab in the current session that you could then action. I'll think about that. I've thought maybe having a shortcut or action for 'suck up these thumbs into the mouse' and then 'spit those thumbs back in here', like in cut and paste terms.
>>12636 Oh, well that's good to hear. I don't have any problem with the PTR, I just thought you did since people talk a lot about tags in it and I remember there was a lot of complaining a year or two ago about wrong tags. I just use it as an aid to my own tags to sort stuff, so I'm happy if you're happy. I do want to try running my own PTR and file server with Hydrus if I ever get the money and time, so if I do I'll give feedback. As for that workflow idea, I'll give it a try with some files once I finish with my duplicates.
>>12633 I suggested 30 minutes as the obviously safe option, where no one could possibly watch a preview for that long, but the majority of AFK sessions would be longer than that. I wouldn't expect to see much reduction in sessions pruned by going with 10 minutes, but who the hell is legitimately watching the preview for 12 minutes and wants that time recorded? Honestly, I'm not convinced this needs to be a setting. 10 minutes is way more than almost anyone will ever watch in a thumb. 60 minutes would cut out the vast majority of AFK sessions. It's such a broad range to work in that 30 minutes would do the right thing in 95%+ of cases (and I'm being cautious with my numbers), so why add more clutter to the settings for something almost no one would need to change?
(8.40 KB 815x24 media viewer.jpg)

>>12633 >>12654 Could any potential solution for this be extended to the media viewer? This seems like the same problem of getting distracted during, say, an archive/delete session, and leaving one image open (but minimized) for a few hours. I've had this, but also a similar problem with the media viewer (pic related). I basically had a bunch of 'alternate/duplicate' files, and I was flipping between all of them hundreds of times trying to decide which to keep, tags etc. Is it possible to detect if the client doesn't have focus, and stop tracking immediately? I feel like that would solve 90% of issues right off the bat. Otherwise, tying it in with the client's inbuilt 'idle' state could also work well (and is already user configurable in options -> maintenance and processing).
>>12656 >>12654 Thanks lads, only catching up now. This isn't in yet, but I'll keep this in mind as I do this system. I'll do min/max for media viewer as well. Options are easy to add, and I know someone will want to say 'no minimum time' or whatever, so I'll throw them in for anyone who is interested. 5s/10m sounds like an ok default min/max. I think I can catch a focus lost event, although some of that stuff is a little unreliable. I'll play around with it and we'll try iterating on this a bit.
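The min/max idea above, combined with stopping the clock on a focus-lost event, might be shaped like the sketch below. The 5 second and 10 minute defaults are the ones floated in the post; the `ViewTimer` class, its method names, and the choice to clamp over-long sessions to the maximum (rather than discard them) are assumptions.

```python
class ViewTimer:
    """Hypothetical per-media view timer with a minimum and maximum session length."""

    def __init__(self, min_seconds=5.0, max_seconds=600.0):
        self.min_seconds = min_seconds
        self.max_seconds = max_seconds
        self.total_seconds = 0.0
        self.views = 0
        self._started_at = None

    def start(self, now):
        """Call when the media is shown and the window has focus."""
        self._started_at = now

    def stop(self, now):
        """Call when the media changes, the viewer closes, or focus is lost."""
        if self._started_at is None:
            return
        elapsed = now - self._started_at
        self._started_at = None
        if elapsed < self.min_seconds:
            return  # too short to count as a real view
        # Assumed behaviour: an over-long (probably AFK) session is clamped.
        self.total_seconds += min(elapsed, self.max_seconds)
        self.views += 1
```

Timestamps are passed in explicitly, so the same logic works whether `stop` is driven by a focus-lost event, the client's idle state, or a page change.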

