Heatmaps Rollout and Heatmap Styling Poll

This depends on how the lines are represented. If it's absolute values, hand-crafted scripts will show peaks at the multiples of 10 and barely anything in between. This would end up looking more like:
[image]

This view is meaningless compared to what a generated script could produce:
[image]

This view is much clearer. But the question is mostly how this 'evenness' is created. We could make each point look at the values 10 above and below it and average over that window, which means it always includes the 2 most common values (the multiples of 10). But this could generate a very flat graph.
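As a rough sketch of that averaging idea (the function name `smooth`, the 101-bin list and the made-up peak height of 50 are my own assumptions, not anything from a real tool):

```python
def smooth(counts, radius=10):
    """Average each bin with its neighbours up to `radius` positions away."""
    smoothed = []
    for i in range(len(counts)):
        lo = max(0, i - radius)
        hi = min(len(counts), i + radius + 1)
        window = counts[lo:hi]          # shorter near the edges
        smoothed.append(sum(window) / len(window))
    return smoothed

# Hypothetical hand-crafted script: activity only on the multiples of 10.
spiky = [50 if p % 10 == 0 else 0 for p in range(101)]
flat = smooth(spiky)
```

With this input the smoothed curve has lost most of its contrast, which is the "very flat graph" concern.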

On the other hand, even if it's that flat, if it's a relative view between lowest and highest, it can still show a trend. And that's why I think such a curve is better than absolute values.

But at least there is an easy method to measure this. You keep a counter for every position from 0 to 100, and each time the device crosses or stops at a position, you add 1. In the end you get a value per position that represents activity. Sure, these can easily reach values up to 2500 in a 3000-node script, but that's fine. It's not going to exceed integer limits or go excessively high.

We then look at the lowest value (let's say that was 300) and subtract it from all points, which sets the zero point for the relative display. Next, we check the highest value (let's say 2500) and bring it down to the limit value (let's say 100) by scaling all points by the same factor (val x 100/2500).
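A minimal sketch of the counting and rescaling could look like this. The names `activity_counts` and `normalise` are made up for illustration, positions are assumed to be integers from 0 to 100, and I am assuming the highest value is taken after the lowest has been subtracted, so the top of the curve lands exactly on the limit:

```python
def activity_counts(positions):
    """Count how often each position 0..100 is crossed or stopped at."""
    counts = [0] * 101
    if not positions:
        return counts
    counts[positions[0]] += 1              # starting point counts once
    for start, end in zip(positions, positions[1:]):
        if start == end:
            counts[end] += 1               # device holds at this point
            continue
        step = 1 if end > start else -1
        # every point after `start`, up to and including `end`
        for p in range(start + step, end + step, step):
            counts[p] += 1
    return counts


def normalise(counts, limit=100):
    """Shift the lowest value to 0 and scale the highest to `limit`."""
    lowest = min(counts)
    shifted = [c - lowest for c in counts]
    highest = max(shifted)
    if highest == 0:                       # completely even activity
        return [0.0] * len(counts)
    return [c * limit / highest for c in shifted]


# Hypothetical script: two full strokes, then action in the upper half.
positions = [0, 100, 0, 100, 50, 100]
curve = normalise(activity_counts(positions))
```

The start of each stroke is not counted again, so a turning point is only added once per visit.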

And now we can just display those values in a graph. There are not going to be peaks unless a vibration pattern happens to stay for a long time at, for example, 40-50. But we can counter this by using something like the OFS simplifier algorithm when we detect 10 consecutive strokes that follow each other with gaps of 100ms. This kills vibration patterns, as it essentially just removes all their points.
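To be clear, the following is not the OFS simplifier itself, just a plain run filter that sketches the detection part. The function name, the `(time_ms, position)` tuple layout and reading "a gap of 100ms" as "at most 100ms" are all my assumptions:

```python
def drop_vibrations(actions, max_gap_ms=100, min_run=10):
    """Drop runs of >= min_run strokes whose gaps are all <= max_gap_ms.

    `actions` is a time-sorted list of (time_ms, position) tuples.
    """
    if not actions:
        return []
    kept = []
    run = [actions[0]]
    for prev, cur in zip(actions, actions[1:]):
        if cur[0] - prev[0] <= max_gap_ms:
            run.append(cur)                # still inside a fast run
        else:
            if len(run) - 1 < min_run:     # n points make n - 1 strokes
                kept.extend(run)
            run = [cur]
    if len(run) - 1 < min_run:
        kept.extend(run)
    return kept


# Hypothetical input: slow strokes, a 50ms buzz around 40-50, a final point.
slow = [(0, 0), (1000, 100), (2000, 0)]
buzz = [(3000 + i * 50, 40 + 10 * (i % 2)) for i in range(12)]
tail = [(5000, 0)]
cleaned = drop_vibrations(slow + buzz + tail)
```

Here the whole buzz section is removed and only the slow strokes and the final point are left for the heatmap.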

We could even introduce the aspect of time here, where slower strokes generate a higher value to add (a stroke of 1 second gives a weight of 100, while 0.5s only gives 50).
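A sketch of that time weighting, assuming the weight of a stroke is simply its duration in seconds times 100 and that it is added to every position the stroke passes (the function name and the `(time_ms, position)` layout are hypothetical):

```python
def weighted_counts(actions):
    """Add a duration-based weight to every position a stroke covers.

    `actions` is a time-sorted list of (time_ms, position) tuples.
    """
    counts = [0.0] * 101
    for (t0, p0), (t1, p1) in zip(actions, actions[1:]):
        weight = (t1 - t0) / 1000 * 100    # 1 s -> 100, 0.5 s -> 50
        lo, hi = min(p0, p1), max(p0, p1)
        for p in range(lo, hi + 1):        # both endpoints included
            counts[p] += weight
    return counts


# Hypothetical example: a 1 second upstroke and a 0.5 second downstroke.
counts = weighted_counts([(0, 0), (1000, 100), (1500, 0)])
```

For simplicity this version counts both endpoints of each stroke, so turning points get the weight of the stroke before and the stroke after.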

It's highly unlikely we are going to run into integer issues if we only apply simple weights. And floats probably still provide enough precision (and I am taking into account that PHP is bad at handling this! That's how little of an issue I see here).

The resulting graphs that I expect will then look like:
[image]
In this example there is some base action going on, a little in the middle, but most action is at the tip, with again a peak at the upper point. The longer it stays in a region, the higher the bump. Note, though, that even if the lower portion barely shows action, over 50% of the strokes could still pass through it. That's because we amplified the curve.

I have done a bit of analysis with a corpus of ~2000 scripts, including all of the ones from here. My approach is quite simple, using some basic Python code:

  • Use json.load() to read the file from disk
  • Extract just the actions from the resulting dict - we don’t care about the metadata - and save them as a Pandas dataframe. From this point it’s quite simple to calculate metrics for each script. This thread is quite useful for speed calculations.
  • For calculating the compression ratio:
    • Convert the timestamps to time deltas - many scripts don’t start at the first frame, and the timestamps contain a lot of redundant information that could bias the compressibility.
    • Export the time delta and position columns to csv (without an index column or header row)
    • Zip the resulting csv file, using the highest level of LZMA compression, and take the ratio of the zipped csv file size to the original file size
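The steps above can be sketched roughly as follows. This is not the exact code used for the analysis. It swaps Pandas for the standard library, parses a string with json.loads() instead of reading from disk, uses the lzma module directly rather than a zip container, and takes the ratio against the uncompressed csv, all of which are assumptions on my side. It also assumes the usual funscript layout of an "actions" list with "at" (milliseconds) and "pos" fields:

```python
import json
import lzma
import random


def compression_ratio(funscript_text):
    """Ratio of LZMA-compressed size to plain csv size for one script."""
    actions = json.loads(funscript_text)["actions"]   # metadata is ignored
    actions = sorted(actions, key=lambda a: a["at"])
    rows = []
    prev_at = actions[0]["at"]
    for a in actions:
        # time delta instead of absolute timestamp, no header, no index
        rows.append(f"{a['at'] - prev_at},{a['pos']}")
        prev_at = a["at"]
    csv_bytes = ("\n".join(rows) + "\n").encode("ascii")
    packed = lzma.compress(csv_bytes, preset=9 | lzma.PRESET_EXTREME)
    return len(packed) / len(csv_bytes)


# Hypothetical scripts: a perfectly regular one and a randomly varied one.
regular = json.dumps({"actions": [
    {"at": 5000 + i * 250, "pos": 100 * (i % 2)} for i in range(2000)
]})

rng = random.Random(0)
t, varied_actions = 5000, []
for _ in range(2000):
    t += rng.randint(100, 900)
    varied_actions.append({"at": t, "pos": rng.randint(0, 100)})
varied = json.dumps({"actions": varied_actions})

regular_ratio = compression_ratio(regular)
varied_ratio = compression_ratio(varied)
```

The regular script should come out with a much lower ratio than the varied one, which is the effect the metric relies on.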

Although most of the values fall in a fairly narrow range, there is quite a bit of variation in compression ratio (capped at 0.30 in the figure here for readability):

At the lower end, there tend to be more beat-based CH-style scripts like this, with only occasional changes of stroke speeds and stroke lengths:
https://discuss.eroscripts.com/t/ch-red-light-green-light-softcore/74604

In the middle, around the 0.06 mark, there are scripts like this one, which have a bit more variety but which still contain some fairly long regular sections:
https://discuss.eroscripts.com/t/cock-hero-dream-girls/13893

Extremely short scripts (less than 1 minute) don’t compress very well due to the overhead in the compression algorithm. If we ignore those, then at the upper end of the scale are scripts like this one, which tend to be slower, more action-based, and have quite varied stroke speed and stroke length:
https://discuss.eroscripts.com/t/stop-and-go-take-two/52840

There are many other metrics you can look at, and each one favours different kinds of scripts. E.g. stroke length standard deviation has a distribution like this:


The big peak at 0 consists of all the fap hero scripts, which use full strokes throughout. At the very top end is this one, which makes extensive use of little bounces at the ends of longer strokes:
https://discuss.eroscripts.com/t/katesplayground-nirvana/138806

Turning it down a bit to about 35, you get mostly PMVs, e.g.
https://discuss.eroscripts.com/t/thots-vs-e-girls-script-request-fulfilled/153162

In theory, it ought to be possible to calculate lots of metrics like these and feed them into some sort of recommender system, which would make it easier for site users to discover new scripts in line with other scripts they liked.

Tbh, short/long is still a metric for scripts that can be used. While comparing short scripts against long ones isn't really useful, comparing short with only short is. And even here you can get a standard deviation on compression, as the curve should generally still be similar. For both short and long you can calculate a median. And using a formula you can even compare scripts with the median in mind.

So let's say short gets 0.25 and long gets 0.10 as the median value. If you multiply the value of a short script by 0.4, it should line up somewhere near the value for a long script, allowing them to be compared to a decent degree. It might need a constant to refine the value a bit (so maybe short scripts only need multiplication by 0.3, or maybe by 0.5, but it can get them a lot closer together).
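Worked out as a small sketch (the function name and the sample ratios are made up, only the 0.25 and 0.10 medians come from the example above):

```python
from statistics import median


def scale_factor(short_ratios, long_ratios):
    """Factor that moves the short-script median onto the long-script median."""
    return median(long_ratios) / median(short_ratios)


# Hypothetical compression ratios with medians of 0.25 and 0.10.
short_ratios = [0.20, 0.25, 0.30]
long_ratios = [0.08, 0.10, 0.12]

factor = scale_factor(short_ratios, long_ratios)   # 0.10 / 0.25
adjusted = [r * factor for r in short_ratios]
```

After scaling, the median of the adjusted short scripts sits at the long median, so the two groups can be put on one axis. Any extra refining constant would just be multiplied into `factor`.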

Also, how do you define stroke length? Is that from full up to full down? Or do you consider each node? Having many nodes within a single stroke doesn't always mean more action, it just means more precise action, which an OSR2 benefits from (while a Handy often doesn't).

For now, I’ve used a very naive definition of stroke length - simply the difference in position between actions. This will tend to understate perceived stroke length, as it’s counting successive actions in the same direction as separate strokes. I’m sure there are more complex and meaningful definitions, but I think it’s clear that even with this very basic one, it’s easy to distinguish between different kinds of scripts.
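That naive definition could be sketched like this. The function name is made up, and using the population standard deviation (pstdev) is an assumption, the original analysis may well have used the sample version:

```python
from statistics import pstdev


def stroke_length_std(positions):
    """Standard deviation of the position change between successive actions."""
    lengths = [abs(b - a) for a, b in zip(positions, positions[1:])]
    return pstdev(lengths) if lengths else 0.0


# Hypothetical scripts: full strokes only vs. long strokes with small bounces.
full_strokes = [0, 100] * 20
with_bounces = [0, 100, 90, 100, 0, 10, 0, 100, 90, 100, 0]

full_std = stroke_length_std(full_strokes)
bounce_std = stroke_length_std(with_bounces)
```

The full-stroke script gives exactly 0, while the one mixing long strokes with little bounces at the ends gives a large value.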

I’m guessing this is based off of HereSphere’s heatmap configuration. I’ve been trying to dial in the maximum speed for the Handy recently, and for a hard mode project I’m working on I decided to bump up the max speed to 469 (it’s a funny number I’ll remember, and OFS rounded to it). I think it’s worked out really well, as at that speed the vibrations still work unless I’m using a very tight sleeve that adds a lot of drag.

So for scripting, I agree that 580 is too fast and doesn't make good use of the spectrum of colors available to you. That said, I think this color range is useful when downloading and consuming scripts, so you can see how “bad” the script may be, or get an idea of how you can tweak it with some max settings in MFP or Script Player, etc.

More polishing and multi-axis merging for those who want it