User 128511, Deleting Stack Overflow Account

2024-11-02

3 years ago I said I was done answering questions on stack overflow. But then, in January 2023 I started working on WebGPU and thought I'd help out by answering WebGPU questions there.

Of course, it was only a matter of time before I ran into all the same types of issues I'd mentioned before.

So, this time, instead of just stopping, I'm deleting my account. I no longer want to be abused by stack overflow's moderators. They don't value the contributors. Why should I give my time freely to them? I'm done. If you want answers from me for WebGPU, please use the WebGPU discussions on github.


Zelda, Tears of the Kingdom - Disappointing

2023-06-10

As I've posted before, the Zelda series is my absolute favorite game series (console games only) and Breath of the Wild (BotW) is my favorite game of all time. I'm not saying the game was perfect, it definitely had its share of issues, but overall, the amount of joy I got from that game surpassed any other game I've played.

So, of course I was super excited when the next Zelda game came out. Zelda: Tears of the Kingdom (TotK).

Well… Sadly, so far, it's been a huge disappointment. I'm 60 hours into the game. I was going to wait until I finished because maybe by the end I'd have changed my mind. But then I thought, no, I should write what I'm feeling now. Regardless of how well it ends, I've got lots of time in the game already and want to record how I felt for 50 or so of the last 60 hours.

No Joy and Wonder of discovery

Probably the single biggest disappointment is that TotK takes place in the same world as BotW. I know lots of fans like that, but you can read how disappointed I was with that in A Link Between Worlds. Lots of people love that about both games. For me, it robbed me of the #1 joy I got out of BotW, that is, discovering new places. Almost everywhere I go in TotK I've already been. The joy of discovery is removed. I remember playing BotW and climbing a mountain and feeling wonder at seeing the spiral Rist Peninsula. I remember seeing Eventide Island the first time and thinking, OMG! I can go all the way over to that island!? I remember the joy I felt the first time I crossed the Tabantha Great Bridge and saw how deep the canyon was. I remember the first time I discovered the massive Forgotten Temple. And 30-50 other just as wondrous "wow" moments. The first time I saw a dragon. The first time I saw a dragon up close. The first time it wasn't raining near Floria Bridge. The first time I saw Skull Lake. The first time I saw Lomei Labyrinth Island. The first time I saw the Leviathan Bones. And on and on.

All of that joy is missing from TotK because I've already been to all of these places. There are a few new places in the sky but, so far, none of them have been impressive.

Building is a great idea but also a chore

In TotK you can build things. They took the Magnesis power from BotW and added that when you move one item next to another you can pick "Attach" and they'll get glued together. The difference is, in BotW, if you were near water you wanted to cross, you'd go find a raft. In TotK, you instead have to go find parts. For example, find 3 logs, or cut down 3 trees for 3 logs, then glue the logs together, and now you have a raft. It takes a couple of minutes to build the raft. This makes TotK more tedious than BotW. I didn't really want to build the raft, I just wanted to cross the river. Being able to build things is a great idea but it's also unfortunately a chore. Maybe I'm not being creative enough but mostly it's pretty obvious what to build and how to build it.

There is no guidance on direction

In BotW, after Link gets off the plateau, it's suggested he should go east. The enemies and things encountered in that direction are designed for the beginning player. Of course the player is free to go anywhere, but if they go in the order suggested they'll likely get a better experience as enemies will be weaker, shrines will have stuff that trains them. Etc...

In TotK, unless I missed it, no such direction was given. I ended up going to Rito Village first because that was what some character in the game suggested. 30-40hrs in, I was going east from the center of Hyrule, and it's clear the designers wished I'd gone that way first as the training shrines are all there. Training you to use arrows, training you to parry, training you to throw weapons, etc… It's possible I missed the hint but it feels like there was no guidance suggesting I go that direction first.

Hitting Walls

I cleared 3 bosses (Rito, Goron, Zora) with no pants. Why? Because I never found any source of money in the first 20-30 hours of play, so all I could afford was cold armor (500) and cold headgear (750). Pants cost 1000. Later I needed flame guard armor and had to use all my money to buy just the top. I didn't have enough money to buy pants, nor did I run into any source of money that far into the game.

Here's my character, 60 hours into the game!

too poor for pants

Another wall came up when I went to my 4th boss (Gerudo). The 2nd phase of the boss was way too hard. I quit the boss, went and made 25 meals for hearts, and even then I could tell there was no way I was going to beat it when 12 or so fast moving Gibdos each take all of my hit points with 1 hit.

After dying too many times I finally gave in and checked online, the first time I'd done so. According to what I read, my armor isn't up to the fight. Now I've spent 30+ hours trying to upgrade my armor but it's a super slog. I need to unlock all of the fairies. Each one requires its own side quest or 2. Once I've unlocked them I have to go item hunting, which will be another 10+ hours. I actually have money now (~3000) and lots of gems but I don't know where to buy good armor. I found the fashion armor. I got some sticky armor. But I have yet to get any of my armor upgraded more than 2 points past what it was 30 hours ago.

armor collection 60hrs in

Fighting is still too hard (for me)

This is my collection of weapons 60hrs in!

60hrs weapons

This is my collection of shields.

60hrs shields

Where are the weapons!?!?!?!? The shield collection looks ok for 60hrs but the weapon collection does not. Where are they?

I complained about the fighting in BotW. I found it not as fun as in previous Zelda games. Fighting in TotK hasn't changed, so that's the same. I get that I suck at it because I can watch videos of people who don't. But, for whatever reason, unlike every other Zelda, I've never gotten the hang of fighting in either BotW or TotK. As such, I avoid fights as much as possible because basically the odds of me dying are around 1 out of 3. Especially if the enemy is a Lizalfos. They run fast, they take my weapon and/or shield, leaving me defenseless.

Taking on a single enemy is something I can often handle but taking on 3 or more I'm more often than not going to die.

I complained about this in BotW as well. I wish there was a combat trainer in some village near the beginning of the game. He'd ask if you want to be trained and you could pick yes or no. That way, people who hated the mandatory training from previous Zelda games could skip it, but people like me, who want to train in a place where you don't lose any hit points and never die, would have a place to learn how to actually fight.

In BotW I basically avoided as many fights as I could and skipped all the shrines with medium or hard tests of combat until after I'd finished the game. In TotK it's been similar. I'm avoiding fights for the most part.

Surprisingly, in both games, the bosses (well, most of them) were easy or about the same level as previous Zelda games, so it's strange that combat with random monsters out in the world is so friggen difficult.

The World of TotK is not Interesting

The world of TotK is not as interesting as BotW. Yes, it's the same map but things have changed.

In BotW there were signs all over the world of ancient times, ruins, fields of dead guardians, it felt epic. In TotK the world is covered with rubble from some sky people's world falling down. For whatever reason, I'm not finding the TotK world compelling.

Location of some epic ancient battle in BotW

In BotW I'd come across a field of broken guardians next to a large thick stone wall. It was clearly the site of an epic battle. Stuff all over BotW's world suggests the place has history. Nothing in the world of TotK has made me wonder anything at all. The idea of a Laputa-like civilization in the sky is interesting but nothing about the sky world presented in TotK suggests anything interesting actually happened there. Instead it's all just stuff designed around gameplay, not around what a civilization in the sky might be like.

It was a mistake to use the same world as BotW in that there's no consistency. Of course Zelda games have never been consistent but also, except for "A Link Between Worlds" (which I was also disappointed with), no Zelda game has had anything to do with any other Zelda game.

TotK though, because it's in the same world and because that world is so detailed, arguably needs more consistency. All this talk of a world in the sky that's always been there and is the source of the clean water in Zora's Domain, etc., doesn't match BotW. The fact that all the old shrines are gone but have magically been replaced by new ones, yet Kakariko Village and Hateno Village are basically unchanged, makes no sense. Of course, going from the first principle (no Zeldas share anything) it doesn't matter. But the fact that this Zelda is in the same world, and that Zelda even references Link saving Hyrule previously, means all those inconsistencies are highlighted. If they'd just made a new world, those inconsistencies would disappear.

Dark World

First off, what do these 6 pictures have in common?

chasms

Now look at this

NOT A CHASM!

During my first 60hrs, I saw the red gloom covered pits, always from a distance, always from ground level. I thought I was supposed to avoid them! Especially because I thought they were the home of these

Scary OMG 😱!

Those gloom hands are super scary. The screen changes color, the music gets super tense. As soon as I ran into one I beamed out! So, I avoided these gloom covered holes for fear gloom hands would come out.

Some characters seemed to suggest I should check out some "chasms" and so I kept wondering when I'd run into a chasm, knowing that a chasm looks like those 6 examples above, not a pit/crater/hole. In fact there are at least 4 chasms in BotW: Tanagar Canyon, Gerudo Canyon, Karusa Valley, Tempest Gulch. All of those are chasms.

At the 60hr mark, I finally decided to check online for where I could find weapons. The first post I found said, inside the "Hyrule Field Chasm", and marked it on the map. I'm like, WTF? There's a chasm there? I go look and find it's just one of these pits, not a chasm. So yea, because of poor localization, or because the translator didn't bother to look up what a chasm is, the "chasms" are mis-named. 🤬

pit/hole/sinkhole/abyss

I was kind of pissed off I'd missed this for 60hrs (though I had been in the one from the Goron boss, which to be honest was the only "wow" moment for me in the game so far). I was wondering when I'd find other entrances, especially since someone gave me a map marking some spot in the dark far west of Death Mountain. Now I knew.

On the other hand, I was excited, hoping this was where I'd find the things I'd been missing. Namely, discovering interesting places that filled me with wonder.

Well ... after 10hrs of exploring, no, the dark world doesn't provide what I was missing. In fact, it's super boring!

I literally spent 6-7 hours just trying to find anything interesting, going from lightroot to lightroot. This is what I opened

boring

That entire area had nothing. 5 or so hours in, I saw on the map there appeared to be something of interest at the far north but I couldn't find a way to access it. I tried diving into the pit under Hyrule Castle but I didn't find a way to the stuff on the map, even though it marked me as just north of it. I eventually gave up on that. I eventually found some stairs with flames and was hoping it was a temple or dungeon. No, it was just a place to use "Ascend", which deposited me on a tower at the Bridge of Hylia.

At the 6-7 hour point I finally found "Autobuild" and thought maybe that would open something new. Nope. The characters that gave it to me pointed some direction that led to some mine carts. I explored them but found nothing. I spent another couple of hours opening more lightroots and still nothing.

This includes an hour or so of "grinding" since I ran out of arrows and all of my bows broke from shooting giant brightbloom seeds. I know Zelda has always had some amount of grind but it feels worse in TotK, probably because I'm not enjoying the game. First I needed to go get money, then I needed to buy arrows, then I needed to find bows. So yea, about an hour.

The dark hasn't saved TotK for me, in fact it's had the opposite effect. I like it even less given how boring the dark has been. It's like some bad filler content.

TotK has bad writing

Zelda games have never had a ton of story. They're all about the gameplay. But TotK has the worst so far. Let me put that another way: TotK has an interesting story premise. It's just that individual parts make no sense.

In one scene, Ganondorf appears and magically stabs someone in the back. The fact that he could do that invalidates all his other actions and the rest of the story. If he can just magically kill anyone then he should have killed Zelda and the King and everyone who stands in his way.

The scene where the Queen says Zelda is hiding that she wants to help is some of the most silly childish writing ever.

The scene where Ganondorf appears before King Rauru, Queen Sonia, and Zelda, pledging allegiance, makes no sense. Zelda is from the future, so her reaction to seeing him (not sure whether she trusts him) makes no sense. She knows exactly who he is.

Good things of TotK

Things I like about TotK.

Disappointing

I've thought about quitting and not finishing TotK. That's a first for me in a Zelda game. Again, it's my favorite video game series. The only amiibo I own is a BotW guardian.

I have Zelda fan art posters on my walls

I even have Zelda key chains

and Zelda coasters

In other words, I'm a huge Zelda fan, not a hater. It's really disappointing to find I'm not enjoying TotK as much as I had hoped.

At 70hrs, which is probably the 3rd most I've played any game ever (BotW being #1), I think I'm done. I want to see the end but I'm sick of just grinding, trying to find armor so I can survive a boss fight. I can go dive in some other pit but if it's just more grinding from lightroot to lightroot what's the point?


Thoughts after finishing, 15 days after I wrote the stuff above.

According to my profile I "Played for 105 hours or more", so that's 35 hours more than when I wrote my thoughts above. Those 35 hours felt like another 70, and I'm actually surprised it claims only 105 hours given it's been two weeks, but whatever 🤷‍♂️

Dark World

So, apparently I didn't need to check the "chasms". Some time after I got the Master Sword, a character told me to follow him down one of the "chasms" and that led to the things you're supposed to do down there. In other words, the 10 hours I spent trying to find anything down there were mostly pointless and my experience would have been better if I'd not looked online and not followed the advice to go into a "chasm".

Still, I did feel like the dark world is mostly filler. Unlike the world above, which has snow areas, mountain areas, forests, jungles, beaches, cliffs, deserts, etc., the dark world is pretty much the same all over. Once the characters told me what to do down there it wasn't nearly as tedious, although I'd already lit up many of the places they directed me to go.

Mineru

Mineru's addition seemed wasted, or else I didn't figure out how to use it. For something so late in the game with so much flexibility, it seemed like it might add lots of new and interesting gameplay, but in the end I mostly ignored it. I'll have to go online to see what I missed I guess.

The Story

While I had lots of issues with the details in the story and how much of it didn't make any sense, including Ganondorf's last act, I did end up enjoying Zelda's arc. That part was good.

Too Hard

I still found it too hard. I spent I think literally a week or more trying to beat Ganondorf in the last boss fight.

First, after a few tries, it was clear to me I didn't have enough of the right meals to survive, so I beamed out. That means you have to start the entire sequence over: fight your way into the boss area, go through 5 waves of Ganondorf summoning swarms of enemies, before you can get back to the main fight. The whole thing felt so tedious to me, spending several hours getting the right ingredients to make the types of meals needed to survive, and getting the gloom resistant armor and upgrading it. I only managed to upgrade it once per piece, as looking at the requirements, upgrading twice would have easily required another 3-5 hours of nothing but battles with giant gloom monsters in the dark 🙄

Once you're actually fighting Ganondorf you're required to Flurry Rush him, which you can only do after you execute a Perfect Parry or Perfect Dodge. Again I'm going to complain that I wish there was a place you could choose to train, like in older Zelda games, where some teacher would tell you exactly when to do the move and not let you out until you'd done it several times, but that also let you practice quickly.

As it was, I had to learn by fighting Ganondorf 60+ times and it felt like ass to wait for the death screen, wait for the reload, etc. After a few times I'd get frustrated, feel like throwing my controller through my TV, and so quit the game and wait a few hours or the next day to try again. Worse, in Ganondorf's 3rd phase, you have to Perfect Parry/Dodge twice in a row and I could rarely do it.

In the final battle where I beat him, I made it through the first two phases without taking a single hit. In other words, I'd learned to correctly Perfect Dodge. But in the 3rd phase it was still super frustrating: he'd hit me 4 out of 5 times, and only 1 out of 5 times would I be able to do the double Perfect Dodge. Even a single Perfect Dodge was hard. The point being, I needed a place to train so that this battle felt good. I never felt like I was doing it wrong since I was doing it exactly the same as in the previous two phases. Rather, I felt like the game wasn't making it clear what I was supposed to be doing. When I managed to pull off a Perfect Dodge it just felt like luck, as it felt like I was pushing the buttons with the same timing every time.

Building

Once I'd beaten the game I went back in to check a few things I still had marked on the map. I checked out a couple of sky places I'd never been to and for one, the only way I could see to get to the top was to build a flying machine.

Watching some videos, it's clear I missed quite a few interesting things I could maybe have built. On the other hand, many of them are things that don't interest me. I had this same issue in BotW. There wasn't building, but there was physics in BotW, and watching videos of creative ways I could attack groups of outdoor enemies using those techniques was interesting. The thing is, I didn't want to fight the enemies, I wanted to "continue the adventure", so taking the time to set up some special way of attacking enemies just felt like a waste of time. I'm not saying others shouldn't enjoy that activity. Only that I didn't enjoy it. My goal wasn't to fight as many enemies as possible, it was to get to the next goal, discover the next interesting place, advance the story. Except for bosses and enemies in dungeons, the outdoor enemies are just things in the way of what I actually want to do.

There's some crazy contraptions people built in that video above. It's just that building those contraptions doesn't advance me toward completing the game.

Final Thoughts.

It's hard for me to say what I'd feel if I'd never played BotW and only played TotK. I still feel like BotW is a better game even though, in a way, TotK is all of BotW plus more.

I think the issue for me is, BotW was all about discovering the various areas of Hyrule. For me, discovering each area was 60-70% of the joy I got. If I'd never played BotW, maybe I would have enjoyed TotK more, but, the game feels designed for people that played BotW. I feel like BotW was designed to get you to explore the world, by which I mean, based on what the characters you meet tell you, you end up wanting to go to each place. In TotK I feel like that's less true. It's hard to say if that feeling is real or it's only because I've been to all these places already in BotW.

Partly it's that TotK is 1.8x larger than BotW so if they'd directed you to explore all of the BotW parts the game would be way too long. Instead they mostly just direct you to visit some parts plus much of the new stuff and leave the rest as random playground.

In any case, BotW is still my favorite Zelda, and TotK, while it had a few great highlights, is much further down the list if I were to rank every Zelda.

Here's hoping the next one is an entirely new world.


ChatGPT in 1957

2023-02-11

ChatGPT has been all over the tech news for the last couple of months. Well, imagine my surprise when I recently watched this 1957 movie, Desk Set, which features ChatGPT.

In the movie, Bunny Watson (Katharine Hepburn) runs the research department of some corporation. Other divisions call the research department anytime they need info, like "Give me 5 interesting facts about Kenya" or "Who is the head of India and what is their title?". The research department gives you the answers or, if they don't know, they will go research them and get back to you.

Richard Sumner (Spencer Tracy) is a computer engineer who's been hired to install a computer that will give these types of answers. Once they turn it on it works exactly like ChatGPT. You ask it a question in English and it gives you a few sentences worth of answer.

That's pretty much the exact experience of using ChatGPT. That it's in this 66+ year old movie, and that I happened to watch it in December 2022, right around the time ChatGPT came out, was a really interesting coincidence.

I'm a fan of several Katharine Hepburn, Spencer Tracy movies but I can't fully recommend this one. Still, others loved it more than I did, so if you want to watch it it's available on Amazon.


Also, there are a few famous, good, relatively recent AI movies: "Her", "Ex Machina". Some people consider The Matrix an AI movie, though it's arguably more fantasy. The premise of the Terminator movies is AI trying to kill humans.

But, if you've never seen it, my favorite AI movie of all time is still

"Colossus: The Forbin Project"

I was going to post a link to the trailer but IMO all of the trailers have too many spoilers. Not that you can't imagine what will happen based solely on the premise: the United States government designs the most powerful computer ever to run the nation's defenses. What could possibly go wrong? Still, there are a few twists the trailer spoils, so avoid it and JUST WATCH THE MOVIE! It's only 1hr and 40 minutes and it's pretty taut all the way through.

Sadly I don't know where to recommend watching it. It's not on Amazon Prime.

ATM it is here on Vimeo and here on the Archive. I have to believe both are illegal uploads since it's a movie owned by Universal but apparently ATM there is no other way to watch it except to buy a DVD/Blu-Ray and a player. (I don't even own one anymore)

Some of my favorite images from any movie ever are of the size of the computer in the opening scenes.

Note: If you're the type of person that will laugh at the 1970s computers and because of that ridicule the movie, then you're missing out. For me, I grew up in the 70s and learned on computers with slow, 30 characters per second (300 baud), text-only displays and teletype terminals. The fact that the computers don't look like modern computers doesn't detract from the movie in any way IMO, but if you lack the imagination to go there then it's not for you.


Car Rental throwing away money.

2022-06-07

Maybe someone in the car rental industry can help clarify things here but...

At some point in traveling I'd rent a car from Hertz. I'd get to the airport, ride the Hertz shuttle to the Hertz location. There, they'd announce something to the effect of

If you're a gold member look up your car on the dash board and then go straight to your car.

The rest of us had to go into the Hertz office and stand in line at the counter for 10-30 minutes depending on how crowded it was.

I ended up joining the gold club which is free.

Since then I've even been in locations where they say

Just pick any car you want in this section

This all got me wondering though, what is the benefit to Hertz of making people go to the counter? As far as I can tell they are just bleeding money. They could have nearly everyone just walk to their car, saving them from needing 3 to 20 people at the counters and all the counter equipment.

Signing up for a gold membership, if I recall, is just about setting your preferences and verifying your data. There's every incentive for them to do this for all customers before they get to the car rental area. Even if those customers don't sign up for the "gold club" it would still save Hertz a bunch of work.

Note that even after you pick up the car, you drive to the exit and there are gates with employees who check your license and hand you your rental agreement. It usually takes 30 to 60 seconds. Which begs the question, why does it take so damn long at the counter? There, it feels like it's around 10 minutes per customer and the agent at the counter is typing constantly. What are they typing? This is true even if you have a reservation, which means you've already registered all the data they need!

It's seriously ridiculous. Even the airlines at least have automated agents where you type in your name, scan your id, it prints your boarding pass. You drop off your luggage. Done!

Hertz has to be throwing away 100s of millions of dollars a year by not doing both things above: (1) just letting you walk straight to your car, and (2) automating as much as possible.

But who knows. Maybe a Hertz employee can explain why, even though it's clearly possible (as evidenced by the Gold Club), they don't just do it for everyone and save so much time and money.


What if good code colorization came before naming standards?

2022-06-04

Related to this post on time wasted because of naming standards, I just ran into this 2018 talk about tree-sitter, a fast language parser for code colorization written by Max Brunsfeld at GitHub.

It seems pretty clear something like this had to be developed later than naming standards, but it's at least interesting to imagine: what if it had come first? Would we even need naming standards?

In the talk they point out that most editors use a complex set of regular expressions to guess how to color things.

Here's a pretty typical example:

class Foo {
 public:
  static Foo* create(int bar);
  int getBar();

 private:
  explicit Foo(int bar);
  int bar;
};

Foo::Foo(int bar) : bar(bar) {};

Foo* Foo::create(int bar) {
  return new Foo(bar);
}

int Foo::getBar() {
  return bar;
}

What to notice: the colorizer is mostly guessing, so things aren't consistently colored by what they actually are.

This is because most of the colorizers have no actual knowledge of the language. They just have a list of known language keywords and some regular expressions to guess at what is a type, a function, a string, a comment.

What if the colorizer actually understood the language?

For example, here's the same code, colored by something that actually parses the language:

class Foo {
 public:
  static Foo* create(int bar);
  int getBar();

 private:
  explicit Foo(int bar);
  int bar;
};

Foo::Foo(int bar) : bar(bar) {};

Foo* Foo::create(int bar) {
  return new Foo(bar);
}

int Foo::getBar() {
  return bar;
}

What to notice: Every type is green, every function is yellow, and every bar is red when it's a member of a class. This means we don't need to name it _bar or mBar or bar_ as many style guides would suggest, because the editor knows what it is and shows us by color.

We could also distinguish between member functions and global functions

void Foo::someMethod() {
  doThis();  // is this a member function or a global function?
  doThat();  // is this a member function or a global function?
}

Some of these issues go away by language design. In Python and JavaScript a member function and a property both have to be accessed by self / this so yes, there are other solutions than just coloring and naming conventions to help make code more understandable at a glance.

I haven't used tree-sitter directly (apparently it's used on GitHub for colorization though). I just found the idea interesting that a language-parsing colorizer could make code more readable and help distinguish the things naming conventions are often used for. I get that color isn't available everywhere so maybe it's not a full solution, but it's still fun to think about what other ways we could make it easier to grok code.
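
Out of curiosity, here's a rough sketch of how a colorizer built on tree-sitter's Node.js bindings might work. To be clear, this is my guess at the approach, not how GitHub actually does it; the node type names come from the tree-sitter-cpp grammar and the color mapping is made up.

const Parser = require('tree-sitter');
const Cpp = require('tree-sitter-cpp');

const parser = new Parser();
parser.setLanguage(Cpp);

// made up mapping from syntax node types to CSS classes
const colorForNodeType = {
  'type_identifier': 'type',      // Foo
  'primitive_type': 'type',       // int
  'field_identifier': 'member',   // bar, when it's a class member
};

function colorize(source) {
  const tree = parser.parse(source);  // a real syntax tree, not regexp guesses
  const spans = [];
  (function walk(node) {
    const cssClass = colorForNodeType[node.type];
    if (cssClass) {
      spans.push({start: node.startIndex, end: node.endIndex, cssClass});
    }
    for (const child of node.children) {
      walk(child);
    }
  })(tree.rootNode);
  return spans;  // the caller wraps each span in <span class="..."> etc.
}

Because the parser produces a real syntax tree, bar as a field of a class is distinguishable from bar as a local variable, which is exactly the information conventions like mBar try to encode in the name.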

PS: The coloring above is hand-written and not via tree-sitter.


How many man years are wasted on western naming conventions?

2022-06-02

In most (all?) western languages there's the concept of UPPER case and lower case. I'm guessing this is one reason why we have different styles of naming conventions. Commonly we have things like

ALL_UPPER_CASE_SNAKE_CASE_IS_A_CONSTANT
CapitalizedCamelCaseAreClassNames
lower_case_snake_case_is_a_variable_or_member
lowerCaseCamelCaseIsAVariableOrMember

We often bake these into coding standards. Google's, Mozilla's, Apple's, Microsoft's

Being a person who grew up with a native language that has the concept of UPPER/lower case, some of this seems normal, but several languages, off the top of my head (Japanese, Chinese, Korean, Arabic, Hindi), have no UPPER/lower case. If programming had been invented or made popular by people whose native language was one of those, would we even have the concept of camelCase?

In any case, the fact that we have these styles often leads to extra work.

For example, I'm implementing OpenGL, which has constants like GL_MAX_TEXTURE_SIZE. In the code, our style guide uses mCamelCase for class members so we have something like this

struct Limits {
  unsigned mMaxTextureSize;
  ...

There's then code that is effectively

void glGetIntegerv(GLenum pname, GLint* data) {
  const Limits& limits = getCurrentContext()->getLimits();
  switch (pname) {
    case GL_MAX_TEXTURE_SIZE:
      *data = limits.mMaxTextureSize;
      return;
    ...

Notice all this busy work. Someone had to translate GL_MAX_TEXTURE_SIZE into mMaxTextureSize. We could have instead done this

struct Limits {
  unsigned GL_MAX_TEXTURE_SIZE;
  ...
};

void glGetIntegerv(GLenum pname, GLint* data) {
  const Limits& limits = getCurrentContext()->getLimits();
  switch (pname) {
    case GL_MAX_TEXTURE_SIZE:
      *data = limits.GL_MAX_TEXTURE_SIZE;
      return;
    ...

In this second case, searching for GL_MAX_TEXTURE_SIZE will find all references to the concept/limit we're interested in. In the previous case things are separated and we either have to search for each form individually or we have to write some more complex query. Further, we need to be aware of the coding style. Maybe in a different code base it's like this

struct Limits {
  unsigned max_texture_size;
  ...
};

void glGetIntegerv(GLenum pname, GLint* data) {
  const Limits& limits = getCurrentContext()->getLimits();
  switch (pname) {
    case GL_MAX_TEXTURE_SIZE:
      *data = limits.max_texture_size;
      return;
    ...

In fact I've worked on projects that are made from multiple parts where different parts have their own coding standards, and yet more work is required to translate between the different choices.

The point is, time is spent translating to and from one form (GL_MAX_TEXTURE_SIZE) and another (maxTextureSize).

You can automate this process. Maybe you auto-generate struct Limits above, but you still end up having to write code to do the translation from one form to another, you still have the search problem, and you still need to know which form to reference.
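
To make the busy work concrete, here's a sketch of the kind of throwaway conversion code that ends up existing somewhere, whether in a code generator or just in every programmer's head. The rules below simply mirror the style examples above.

// translate an OpenGL constant name into the various style-guide forms
function glConstantToForms(glName) {
  const words = glName.replace(/^GL_/, '').toLowerCase().split('_');
  const camel = words
    .map((w, i) => i === 0 ? w : w[0].toUpperCase() + w.slice(1))
    .join('');
  return {
    mCamelCase: 'm' + camel[0].toUpperCase() + camel.slice(1),  // mMaxTextureSize
    camelCase: camel,                                           // maxTextureSize
    snake_case: words.join('_'),                                // max_texture_size
  };
}

console.log(glConstantToForms('GL_MAX_TEXTURE_SIZE'));
// { mCamelCase: 'mMaxTextureSize', camelCase: 'maxTextureSize', snake_case: 'max_texture_size' }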

It just had me wondering, how many man years of work would be saved if we didn't have this translation step, which arguably only exists because of man-made style guides, arguably influenced by the fact that western languages have the concept of letter case? I suspect, over all the millions of programmers in the world, it's 100s or 1000s of man years of work per year possibly wasted, just because of the effort of converting between these forms.


ImHUI - first thoughts

2021-02-27

I'm a fan (and a sponsor) of Dear ImGUI. I've written a couple of previous articles on it, including this one and this one.

Lately I thought, I wonder what it would be like to try to make an HTML library that followed a similar style of API.

NOTE: This is not Dear ImGUI running in JavaScript. For that see this repo. The difference is most ImGUI libraries render their own text and graphics. More specifically they generate arrays of vertex positions, texture coordinates, and vertex colors for the glyphs and other lines and rectangles for your UI. You draw each array of vertices using whatever method you feel like. (WebGL, OpenGL, Vulkan, DirectX, Unreal, Unity, etc...)

This experiment is instead actually using HTML elements like <div> <input type="text">, <input type="range">, <button> etc...

This has pluses and minuses.

The minus is it's likely not as fast as Dear ImGUI (or other ImGUI libraries), especially if you've got a complex UI that updates at 60fps.

On the other hand it might actually be faster for many use cases. See below

The pluses are

Thoughts so far

So, what have I noticed so far...

Simpler to get your data in/out

What's nice about the ImGUI style of stateless UI is that you don't have to set up event handlers, nor really marshal data in and out of UI widgets.

Consider standard JavaScript. If you have an <input type="text"> you probably have code something like this

const elem = document.createElement('input');
elem.type = 'text';
elem.addEventListener('input', (e) => {
  someObject.someProperty = elem.value;
});

You'd also need some way to update the element if the value changes

// when someObject.someProperty changes
elem.value = someObject.someProperty;

You now need some system for tracking when you update someObject.someProperty.

React makes this slightly easier. It handles the updating. It doesn't handle the getting.

function MyTextInput() {
  return (
    <input
      value={someObject.someProperty}
      onChange={(e) => { someObject.someProperty = e.target.value; }}
    />
  );
}

Of course that faux react code above won't work. You need to use state or some other solution so that react knows to re-render when you change someObject.someProperty.

function MyTextInput() {
  const [value, setValue] = useState(someObject.someProperty);
  return (
    <input
      value={value}
      onChange={(e) => { setValue(e.target.value); }}
    />
  );
}

So now React will, um, react to the state changing but it won't react to someObject.someProperty changing, like say if you selected a different object. So you have to add more code. The code above also provides no way to get the data back into someObject.someProperty so you have to add more code.
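
Here's a sketch of roughly what that extra code looks like. This is just one way to do it, and the obj/prop props are made up for illustration.

// a controlled input that re-syncs when a different object is selected
// and writes edits back into the data; still noticeably more code
function MyTextInput({obj, prop}) {
  const [value, setValue] = useState(obj[prop]);
  useEffect(() => { setValue(obj[prop]); }, [obj, prop]);  // react to a new object
  return (
    <input
      value={value}
      onChange={(e) => {
        setValue(e.target.value);
        obj[prop] = e.target.value;  // get the data back into the object
      }}
    />
  );
}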

In C++ ImGUI style you'd do one of these

  // pass the value in, get the new value out
  someObject.someProperty = ImGUI::textInput(someObject.someProperty);

or

  // pass by reference. updates automatically
  ImGUI::textInput(someObject.someProperty);  

JavaScript doesn't support passing by reference so we can't do the 2nd style. OR, we could pass in some getter/setter pair to let the code change values.

  // pass the value in, get the new value out (still works in JS)
  someObject.someProperty = textInput(someObject.someProperty);

  // use a getter/setter generator
  textInput(gs(someObject, 'someProperty'));

Where gs is defined something like

function gs(obj, propertyName) {
  return {
    get() { return obj[propertyName]; },
    set(v) { obj[propertyName] = v; },
  };
}

In any case, it's decidedly simpler than either vanilla JS or React. There is no other code to get the new value from the UI back to your data storage. It just happens.

Less Flexible?

I'm not sure how to describe this. Basically I notice all the HTML/CSS features I'm throwing away using ImGUI because I know what HTML elements I'm creating underneath.

Consider the text function. It just takes a string and adds a row to the current UI

ImGUI::text('this text appears in a row by itself');

There's no way to set a className. There's no way to choose span instead of div or sub or sup or h1 etc.

Looking through the ImGUI code I see lots of stateful functions to help this along so (making up an example) one solution is some function which sets which type will be created

ImGUI::TextElement('div')
ImGUI::text('this text is put in a div');
ImGUI::text('this text is put in a div too');
ImGUI::TextElement('span')
ImGUI::text('this text is put in a span');

The same is true for which class name to use or adding on styles etc. I see those littered throughout the ImGUI examples. As another example

ImGUI::text('this text is on its own line');
ImGUI::text('this text is not'); ImGui::SameLine();

Is that a plus? A minus? Should I add more optional parameters to the functions

ImHUI.text(msg: string, className?: string, type?: string)

or

ImHUI.text(msg: string, attr: Record<string, any>)

where you could do something like

ImGUI.text("hello world", {className: 'glow', style: {color: 'red'}});

I'm not yet sure what the best direction is here.

Higher level = Easier to Use

One thing I've noticed is that, at least with Dear ImGUI, more things are decided for you. Or maybe that's another way of saying Dear ImGUI is a higher level library than React or Vanilla JS/HTML.

As a simple example

ImGUI::sliderFloat("Degrees", myFloatVariable, -360, +360);

Effectively represents 4 separate concepts

  1. A label ("Degrees"), probably made with a <div>
  2. A slider. In HTML made with <input type="range">
  3. A number display. In this case probably a separate <div>
  4. A container for all 3 of those pieces.

So, is ImGUI actually simpler than HTML or is it just the fact that it has higher level components?

In other words, to do that with raw HTML requires creating 4 elements, childing the first 3 into one of them, responding to input events, updating the number display when an input event arrives. Updating both the number display and the <input> element's value if the value changes externally to the UI widgets.
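
Here's a sketch of that boilerplate in vanilla JS, just to show how much the single sliderFloat call is hiding. The element structure and details are made up but representative.

// the vanilla JS version of a slider with a label and a value display
function makeSliderFloat(parent, label, min, max, value, onChange) {
  const container = document.createElement('div');
  const labelElem = document.createElement('div');
  labelElem.textContent = label;
  const slider = document.createElement('input');
  slider.type = 'range';
  slider.min = min;
  slider.max = max;
  slider.step = 'any';
  slider.value = value;
  const valueElem = document.createElement('div');
  valueElem.textContent = value.toFixed(2);
  slider.addEventListener('input', () => {
    const v = parseFloat(slider.value);
    valueElem.textContent = v.toFixed(2);
    onChange(v);
  });
  container.append(labelElem, slider, valueElem);
  parent.appendChild(container);
  // the caller still needs this to push external value changes back in
  return (v) => {
    slider.value = v;
    valueElem.textContent = v.toFixed(2);
  };
}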

But, if I had existing higher level UI components that already handled all that, is that enough to make things easier? Meaning, how much of Dear ImGUI's ease of use comes from its paradigm and how much from a large library of higher level widgets?

This is kind of like comparing programming languages. For a given language, how much of the perceived benefit comes from the language itself and how much from the standard libraries or the common environment it runs in?

Notes on implementation

getter setters vs direct assignment

ImGUI uses C++'s ability to pass by reference. JavaScript has no ability to pass by reference. In other words, in C++ I can do this

void multBy2(int& v) {
  v *= 2;
}

int foo = 123;
multBy2(foo);
cout << foo;    // prints 246

There is no way to do this in JavaScript.

Following the Dear ImGUI API, I first tried to work around this by requiring you to pass in a getter-setter like this

var foo = 123;
var fooGetterSetter = {
  get() { return foo; },
  set(v) { foo = v; },
};

which you could then use like this

// slider that goes from 0 to 200
ImHUI.sliderFloat("Some Value", fooGetterSetter, 0, 200);

Of course if the point of using one of these libraries is ease of use then it sucks to have to make getter-setters.

I thought maybe I could make getter-setter generators like the gs function shown above. It means for the easiest usage you're required to use objects, so instead of a bare foo you'd do something like

const data = {
  foo: 123,
};

...

// slider that goes from 0 to 200
ImHUI.sliderFloat("Some Value", gs(data, 'foo'), 0, 200);

That has 2 problems though. One is that it can't be type checked because you have to pass in a string to gs(object: Object, propertyName: string).

The other is it's effectively generating a new getter-setter on every invocation. To put it another way, while the easy to type code looks like the line just above, the performant code would require creating a getter-setter at init time like this

const data = {
  foo: 123,
};
const fooGetterSetter = gs(data, 'foo');

...
// slider that goes from 0 to 200
ImHUI.sliderFloat("Some Value", fooGetterSetter, 0, 200);

I could probably make some function that generates getters/setters for all properties but that also sounds yuck as it removes you from your data.

const data = {
  foo: 123,
};
const dataGetterSetters = generateGetterSetters(data)

...

// slider that goes from 0 to 200
ImHUI.sliderFloat("Some Value", dataGetterSetter.foo, 0, 200);

Another solution would be to require using an object and then make all the ImHUI functions take an object and a property name as in

// slider that goes from 0 to 200
ImHUI.sliderFloat("Some Value", data, 'foo', 0, 200);

That has the same issue though: because you're passing in a property name as a string, it's error prone and types can't be checked.

So, at least at the moment, I've ended up changing it so you pass in the value and it passes back a new one

// slider that goes from 0 to 200
foo = ImHUI.sliderFloat("Some Value", foo, 0, 200);

// or

// slider that goes from 0 to 200
data.foo = ImHUI.sliderFloat("Some Value", data.foo, 0, 200);

It's more performant than using getter-setters, and more performant still than generating getter-setters on every call. Further, it's type safe. ESLint or TypeScript can warn you about non-existent properties and possibly type mismatches.
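
As a usage sketch, the whole UI gets re-declared every frame, something like this. The beginFrame/endFrame names are illustrative; only sliderFloat and text appear in the experiment above.

const data = { foo: 123 };

function renderUI() {
  ImHUI.beginFrame();  // illustrative name
  data.foo = ImHUI.sliderFloat('Some Value', data.foo, 0, 200);
  ImHUI.text(`foo is ${data.foo.toFixed(2)}`);
  ImHUI.endFrame();    // illustrative name
  requestAnimationFrame(renderUI);
}
requestAnimationFrame(renderUI);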

Figuring out the smallest building blocks

The 3rd widget I created was the sliderFloat which as I pointed out above consists of 4 elements, a div for the label, a div for the displayed value, an input[type=range] for the slider, and a container to arrange them. When I first implemented it I made a class that manages all 4 elements. But later I realized each of those 4 elements is useful on its own so the current implementation is just nested ImHUI calls. A sliderFloat is

function sliderFloat(label: string, value: number, min: number = 0, max: number = 1) {
  beginWrapper('slider-float');
    value = sliderFloatNode(value, min, max);
    text(value.toFixed(2));
    text(label);
  endWrapper();
  return value;
}

The question for me is, what are the smallest building blocks?

For example a draggable window is currently hand coded as a combination of parts. There's the outer div, it's scalable. There's the title bar for the window, it has the text for the title and it's draggable to move the window around. Can I separate those so a window is built from these lower-level parts? That's something to explore.

Diagrams, Images, Graphs

You can see in the current live example I put in a version of ImGUI::plotLines, which takes a list of values and plots them as a 2D line. The current implementation creates a 2D canvas using a canvasNode, which returns a CanvasRenderingContext2D. In other words, if you want to draw something live you can build your own widget like this

function circleGraph(zeroToOne: number) {
  const ctx = canvasNode();
  const {width, height} = ctx.canvas;
  const radius = Math.min(width, height) / 2;  // so the circle fits the canvas
  ctx.beginPath();
  ctx.arc(width /2, height / 2, radius, 0, Math.PI * 2 * zeroToOne);
  ctx.fill();
}

The canvas will be auto-sized to fit its container so you just draw stuff on it.

The thing is, the canvas 2D API is not that fast. At what point should I try to use WebGL, or let you use WebGL? If I use WebGL there are the context limit issues. Just something to think about. Given the way ImGUIs work, if you have 1000 lines to draw then every time the UI updates you have to draw all 1000 lines. In C++ ImGUI that's just inserting some data into the vertex buffers being generated, but in JavaScript, with Canvas 2D, it's doing a lot more work to call into the Canvas 2D API.

It's something to explore.

So far it's just an Experiment

I have no idea where this is going. I don't have any projects that need a GUI like this at the moment, but maybe if I can get it to something I think is kind of stable I'd consider using it over something like dat.gui, which is probably far and away the most common UI library for WebGL visualizations.


Done Answering Questions on Stack Overflow

2021-02-16

Over the last ~9 years I've spent way too much time answering questions on stack overflow. I don't know why. I want to say it's because I like helping people. It's certainly not for "internet points". In fact I despise the gamification on Stack Overflow so much I tried to hide it from myself. Here is what my view of Stack Overflow looks like

Notice all the points are missing. I feel an unhealthy influence of points on all sites that have them so I turn them off. I'm convinced someday some scientific research will show they are detrimental to well being and will push to ban them or at least shame them out of existence.

In any case, yea, I spent way too much time answering questions on Stack Overflow. At the time I wrote this I had answered 27% of all the WebGL tagged questions on the site. Including other topics, over 1900 answers in total. I also edited the tags of hundreds of wrongly tagged questions.

Many of my answers took hours to write. It could be figuring out a working solution or it could be debugging someone's code. I generally tried to post working code in as many answers as appropriate since in my opinion, working code is almost always better than just an explanation.

As a recent example, someone was trying to glue together two libraries and was running into issues. I got their minimal repro runnable, tracked down the issue, posted a working solution, and filed a bug report on one of the libraries. The entire process took about 2.5 hours.

I've also pointed out before that I wrote webglfundamentals.org and webgl2fundamentals.org in response to questions on stack overflow. WebGL is a verbose API. People ask questions and there is no simple answer. You could just give them some code that happens to work but they likely need 16 chapters of tutorials to understand that code. That's way too much for stack overflow.

So, 9 years ago I started writing articles to explain WebGL. I tried to go out of my way not to have them be self-promoting. They don't say "WebGL articles by GREGG TAVARES". In fact, except for the copyright license hidden in the code comments, IIRC my name is nowhere on the website. I'd even be happy to remove my name from the license, though I'm not quite sure what legal implications there are. Can I just make something up like "copyright webglfundamentals.org"? I have no idea.

I even moved them from my github account to an organization. The hope was I could find more people to contribute if there was an org, so you could participate in the org and not in my personal site. The sites are under "gfxfundamentals", not "greggman". Unfortunately no one has stepped up to write anything, though several volunteers have translated the articles into Chinese, Japanese, Russian, Korean, and other languages.

In any case, once I'd written the articles I would point people to them on Stack Overflow when it seemed appropriate. If, based on the issues they are having, someone is clearly new to WebGL, I might leave an answer that answers their specific question and then also leave a link to the effect of

You might find these articles useful.

If someone else had already written a good answer I might just add the same as a comment under the question.

Similarly, if one of the articles addressed their particular issue I might link directly to it. Of course if I was answering I'd always leave a full answer, not just a link. I've been doing this for the last 9 years. It's clearly and unambiguously helpful to the user that asked the question as well as to users reading later.

An example of this came up recently. Someone asked a question about how to use mat4 attributes. Someone else left an okay answer that answered the question, though it didn't give a good example. But, given the answer was good enough, I added a comment. "You might find this article useful..." because the article has a better example.

There were 2 other parts to the comment.

  1. The answer stated something incorrect. They claimed drawing different shapes with instancing is impossible. My comment pointed out it was not impossible and specified how to do it.

  2. That brought up another point which is if you want to draw multiple different models in a single draw call, I'd written an example to do that in a stack overflow answer and so I linked to it.

The next day I went to check if there was a new comment, in particular to see if the answerer had addressed their incorrect "it's impossible" blurb. They had, they'd removed that part of the answer. But, further my comment had been deleted!?!?!

WTF!!!!

The comment was triple useful. It was useful because it explained how something was possible. It was useful because it linked to a better working example the questioner needed. And, it was useful because it linked to a more flexible solution.

I didn't know this at the time but there is no record of deleted comments. I'd thought maybe I was dreaming. That 2.5 hours I spent on some other answer happened between 4am and 6am. I meant to go to sleep but got sucked into debugging. When I was finished, I checked for more questions, saw this one, and added the comment, but maybe I was too tired and forgot to press "submit"?

So I left the comment again, this time under the question itself, since the answerer had removed the part about something being impossible. This time I took a screenshot just so I'd know my memory wasn't bad.

I checked back later in the day to find the comment deleted. This prompted me to ask on meta, the stack overflow about stack overflow, what to do about on-topic comments being overzealously deleted.

This is when I found out a bunch of things I didn't know

  1. Comments can be deleted by any moderator for any reason. They don't like you? They can delete all your comments. They hate LGBT people and believe you're LGBT? They can delete your comments. This is one reason why there is no visible comment history.

  2. Comments are apparently meant to be ephemeral.

    Several people claimed comments have absolutely zero value. Therefore their deletion is irrelevant.

I found both of these claims rather ludicrous. Comments have a voting system. Some comments get hundreds of votes. Why would anyone design a voting system for something that has zero value?

Links to other stack overflow questions and answers in comments are scanned and used to show related links on the right side bar. If comments have zero value why would anyone make a system to scan them and display their info?

People can even link directly to other comments. What would be the point of implementing the ability to link to something that has zero value?

But further, I found that, according to various members, the links I'd been leaving are considered spam!!!!

According to these people, the links are nothing but self-serving self-promotion. More than worthless, they considered them actively bad, and I was a bad person for spamming the site with them. Here I was, having spent a few hundred hours writing these articles for users of stack overflow to reference when they needed more than would fit in an answer, but apparently trying to tell them about these articles was against the rules.

Some claimed, though it was frowned on, it was slightly less shitty spam if I spelled out I wrote the articles when linking to them. There was no guarantee they wouldn't still be deleted, only that it was marginally less shitty if I declared my supposed conflict of interest.

To put it another way, if someone else posted the links it would be more okay because there is no conflict of interest. I don't buy that though. They're basically saying the exact same comment by person A is ok but by person B is not. That's effing stupid. Either the comment is useful to people reading it or it's not. Who posted it is irrelevant.

Well, this was the straw that broke the camel's back.

I'm Done Answering Questions on Stack Overflow

Spending all that time answering people's questions and writing these articles to help them was nothing but a burden anyway, so I guess I should be thankful Stack Overflow corrected my delusion that I was being helpful and made it clear I was just a self-serving spammer.

It's probably for the best anyway. I'll find some more productive way to use my time. To be clear, a bit has flipped in my head. My joy, or compulsion, or whatever it was that made me want to participate on Stack Overflow is gone, or cured. Time to move on.


The Day Unity Broke The Internet

2021-02-03

Okay, hyperbole aside, Unity mistakenly hardcoded checking the browser's userAgent for MacOS 10 in their "WebGL" game support. MacOS 11 shipped a few months ago and Chrome/Edge started reporting MacOS 11 when Chrome/Edge is run on MacOS 11.

Result: Nearly all existing Unity games on itch.io, simmer.io, github.io, as well as Unity based visualizations on science sites, corporate training sites etc, stopped working for anyone running MacOS 11 using Chrome/Edge (and probably Brave, Vivaldi, etc...?)

It's an interesting issue

Unity shouldn't have hard coded checking for MacOS 10 and failing if it didn't exist

That's just a bug and they have since fixed it. Though ... DOH! Did it really take much thought not to write code that failed if not "10"?

Unfortunately there are many years of games and visualizations out there. It's unlikely most of those games will get updated. Further, even though Unity's main fix is to fix the bug in Unity itself, to apply it you'd have to re-compile your game in a newer version of Unity. Not only is that time consuming, but it's not improbable your old project is no longer compatible with current versions of Unity and will require a bunch of work refactoring the code.

Fortunately you can just replace the offending file with a patched version, which solves the issue without requiring you to recompile your game. Still, there will be 1000s of games that don't get this update. If we're lucky some of the sites will just do this automatically for their users.

But BTW, users are still uploading new games even today (Feb 2021) that have this bug as they are using older versions of Unity. Maybe some sites could check and warn the user?

It's been best practice for over a decade to NOT look at userAgent

This MDN article spells out why you shouldn't be looking at userAgent. You should instead be doing feature detection.
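
For example, rather than parsing the userAgent to decide whether WebGL 2 will work, you just ask the browser:

// feature detection: ask the browser directly instead of guessing
// from the userAgent string
function supportsWebGL2() {
  const canvas = document.createElement('canvas');
  return !!canvas.getContext('webgl2');
}

if (supportsWebGL2()) {
  // take the WebGL 2 path
} else {
  // fall back
}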

Unfortunately reality doesn't always meet expectations. Many web APIs have quirks that cannot be easily detected. I didn't dig through the Unity code to see if what they were checking for was an "it can't be helped" kind of issue or a "we didn't know we could feature detect this" issue, but I do know I've personally run into these kinds of issues, and I also know sometimes I could feature detect but it would be a PITA, meaning checking "if Safari, fall back to X" might take 2 lines of code whereas checking that whatever browser I'm using actually follows the spec might take 50 lines of code, and I'm lazy 😅

The userAgent is going away

Or at least in theory all the browser vendors have suggested they plan to get rid of the userAgent string or freeze it. Here are Chrome's plans.

It's not clear that what they are replacing it with is all that much better. It's better in that it needs less parsing? It sounds like it still provides all the same info to hang yourself with and to be tracked.

But, in some ways, it does possibly let Unity off the hook. AFAIK Chrome may decide to change their version string to claim MacOS 10 even on MacOS 11. Safari and Firefox already do this, I'm guessing for similar reasons: too many poorly coded sites would break. You might think Safari and Firefox don't report MacOS 11 because of tracking, but if preventing tracking was their goal they wouldn't report the version of the browser in the userAgent, which they do.

Sometimes I want the userAgent

I recently wanted to write some software to check how many users can use feature X, and I wanted to do it by checking which OS and which browser they are on, so I could see, for example, that 70% of users on Safari/Mac can use feature X and 60% of users on Chrome/Android can use the same feature.

That seems like a reasonable thing to want to know, so as much as I don't like being tracked, I'm also not sure getting rid of the data available via userAgent is the best thing. It doesn't appear that data is going away, though, just changing.
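For what it's worth, here's a rough sketch of the kind of tallying I mean. The regexes are deliberately crude and the /stats endpoint is made up, but it shows why the OS/browser info in the userAgent is useful for this.

// Bucket the user by browser/OS from the userAgent (crude on purpose;
// real UA parsing is much messier), then report whether feature X exists.
function getBucket(ua) {
  const browser = /Edg\//.test(ua)     ? 'Edge'     // check Edge before Chrome
                : /Chrome\//.test(ua)  ? 'Chrome'   // (Edge's UA contains "Chrome")
                : /Firefox\//.test(ua) ? 'Firefox'
                : /Safari\//.test(ua)  ? 'Safari'
                : 'other';
  const os = /Android/.test(ua)        ? 'Android'
           : /iPhone|iPad/.test(ua)    ? 'iOS'
           : /Mac OS X/.test(ua)       ? 'Mac'
           : /Windows/.test(ua)        ? 'Windows'
           : 'other';
  return `${browser}, ${os}`;
}

const supportsFeatureX = typeof OffscreenCanvas !== 'undefined';  // stand-in for "feature X"
navigator.sendBeacon('/stats', JSON.stringify({
  bucket: getBucket(navigator.userAgent),
  supportsFeatureX,
}));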

So?

I wrote "Unity broke the internet" mostly because Unity's years-old bug, spread over thousands of sites, potentially forced the browsers to work around those sites rather than progress forward. Unfortunately, it's not the first time that's happened.


Apparently the same thing happened to Firefox and they ended up adjusting their userAgent string. That rabbit hole led me to this horror fest! 😱

Comments

Zip - How not to design a file format.

2021-01-16

The Zip file format is now 32 years old. You'd think that, at 32 years old, the format would be well documented. Unfortunately it's not.

I have a feeling this is like many file formats. They aren't designed; rather, the developer just makes the format up as they go. If it gets popular, other people want to read and/or write it. They either try to reverse engineer the format OR they ask for specs. Even if the developer writes specs, they often forget all the assumptions their original program makes. Those are never written down, and hence the spec is incomplete. Zip is such a format.

Zip claims its format is documented in a file called APPNOTE.TXT which can be found here.

The short version is, a zip file consists of records; each record starts with a 4-byte marker that generally takes the form

0x50, 0x4B, ??, ??

Where 0x50, 0x4B are the letters PK, standing for "Phil Katz", the person who made the zip format. The two ?? are bytes that identify the type of the record. Examples:

0x50 0x4b 0x03 0x04   // a local file record
0x50 0x4b 0x01 0x02   // a central directory file record
0x50 0x4b 0x06 0x06   // an end of central directory record

Records do NOT follow any standard pattern. To read or even skip a record you must know its format. What I mean is, there are several other formats that follow a convention where each record id is followed by the length of the record. So, if you see an id you don't understand, you just read the length, skip that many bytes (*), and you'll be at the next id. Formats of this type include most video container formats, jpgs, tiffs, photoshop files, wav files, and many others.
(*) some formats require rounding the length up to the nearest multiple of 4 or 16.

Zip does NOT do this. If you see an id and you don't know how that type of record's content is structured, there is no way to know how many bytes to skip.
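To make the contrast concrete, here's a sketch in JavaScript of a reader for a hypothetical "id followed by length" format. The layout is made up, but it shows the escape hatch zip lacks: unknown record, skip its size, keep going.

// Walk a hypothetical chunked format: every record is
// [4-byte id][4-byte length][length bytes of data].
function walkChunks(buffer, handlers) {
  const view = new DataView(buffer);
  let offset = 0;
  while (offset + 8 <= view.byteLength) {
    const id   = view.getUint32(offset, true);
    const size = view.getUint32(offset + 4, true);
    const handler = handlers[id];
    if (handler) {
      handler(new DataView(buffer, offset + 8, size));
    }
    // Known or unknown, we can always hop to the next record.
    // (Some real formats also pad `size` to a multiple of 4 or 16.)
    offset += 8 + size;
  }
}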

APPNOTE.TXT says the following things

4.1.9 ZIP files MAY be streamed, split into segments (on fixed or on removable media) or "self-extracting". Self-extracting ZIP files MUST include extraction code for a target platform within the ZIP file.

4.3.1 A ZIP file MUST contain an "end of central directory record". A ZIP file containing only an "end of central directory record" is considered an empty ZIP file. Files MAY be added or replaced within a ZIP file, or deleted. A ZIP file MUST have only one "end of central directory record". Other records defined in this specification MAY be used as needed to support storage requirements for individual ZIP files.

4.3.2 Each file placed into a ZIP file MUST be preceded by a "local file header" record for that file. Each "local file header" MUST be accompanied by a corresponding "central directory header" record within the central directory section of the ZIP file.

4.3.3 Files MAY be stored in arbitrary order within a ZIP file. A ZIP file MAY span multiple volumes or it MAY be split into user-defined segment sizes. All values MUST be stored in little-endian byte order unless otherwise specified in this document for a specific data element.

4.3.6 Overall .ZIP file format:

      [local file header 1]
      [encryption header 1]
      [file data 1]
      [data descriptor 1]
      . 
      .
      .
      [local file header n]
      [encryption header n]
      [file data n]
      [data descriptor n]
      [archive decryption header] 
      [archive extra data record] 
      [central directory header 1]
      .
      .
      .
      [central directory header n]
      [zip64 end of central directory record]
      [zip64 end of central directory locator] 
      [end of central directory record]
   

4.3.7 Local file header:

      local file header signature     4 bytes  (0x04034b50)
      version needed to extract       2 bytes
      general purpose bit flag        2 bytes
      compression method              2 bytes
      last mod file time              2 bytes
      last mod file date              2 bytes
      crc-32                          4 bytes
      compressed size                 4 bytes
      uncompressed size               4 bytes
      file name length                2 bytes
      extra field length              2 bytes

      file name (variable size)
      extra field (variable size)
   

4.3.8 File data

Immediately following the local header for a file SHOULD be placed the compressed or stored data for the file. If the file is encrypted, the encryption header for the file SHOULD be placed after the local header and before the file data. The series of [local file header][encryption header] [file data][data descriptor] repeats for each file in the .ZIP archive.

Zero-byte files, directories, and other file types that contain no content MUST NOT include file data.

4.3.12 Central directory structure:

      [central directory header 1]
      .
      .
      . 
      [central directory header n]
      [digital signature] 
   

File header:

        central file header signature   4 bytes  (0x02014b50)
        version made by                 2 bytes
        version needed to extract       2 bytes
        general purpose bit flag        2 bytes
        compression method              2 bytes
        last mod file time              2 bytes
        last mod file date              2 bytes
        crc-32                          4 bytes
        compressed size                 4 bytes
        uncompressed size               4 bytes
        file name length                2 bytes
        extra field length              2 bytes
        file comment length             2 bytes
        disk number start               2 bytes
        internal file attributes        2 bytes
        external file attributes        4 bytes
        relative offset of local header 4 bytes

        file name (variable size)
        extra field (variable size)
        file comment (variable size)
   

4.3.16 End of central directory record:

      end of central dir signature    4 bytes  (0x06054b50)
      number of this disk             2 bytes
      number of the disk with the
      start of the central directory  2 bytes
      total number of entries in the
      central directory on this disk  2 bytes
      total number of entries in
      the central directory           2 bytes
      size of the central directory   4 bytes
      offset of start of central
      directory with respect to
      the starting disk number        4 bytes
      .ZIP file comment length        2 bytes
      .ZIP file comment       (variable size)
   

There are other details involving encryption, larger files, and optional data, but for the purposes of this post this is all we need.
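To make the layout above concrete, here's a sketch of decoding it with a DataView. The field offsets follow the table quoted from the APPNOTE.TXT, and everything is little-endian per 4.3.3.

// Decode the fixed 22-byte portion of an end-of-central-directory
// record starting at `offset`.
function readEndOfCentralDirectory(view, offset) {
  return {
    signature:        view.getUint32(offset +  0, true),  // 0x06054b50
    diskNumber:       view.getUint16(offset +  4, true),
    centralDirDisk:   view.getUint16(offset +  6, true),
    entriesThisDisk:  view.getUint16(offset +  8, true),
    totalEntries:     view.getUint16(offset + 10, true),
    centralDirSize:   view.getUint32(offset + 12, true),
    centralDirOffset: view.getUint32(offset + 16, true),
    commentLength:    view.getUint16(offset + 20, true),
  };
}

We need one more piece of info, how to make a self extracting archive.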

To do so we could look back at ZIP2EXE.exe, which shipped with pkzip in 1989, and see what it does, but it's easier to look at what Info-Zip does.

How do I make a DOS (or other non-native) self-extracting archive under Unix?

The procedure is basically described in the UnZipSFX man page. First grab the appropriate UnZip binary distribution for your target platform (DOS, Windows, OS/2, etc.), as described above; we'll assume DOS in the following example. Then extract the UnZipSFX stub from the distribution and prepend as if it were a native Unix stub:

> unzip unz552x3.exe unzipsfx.exe                // extract the DOS SFX stub
> cat unzipsfx.exe yourzip.zip > yourDOSzip.exe  // create the SFX archive
> zip -A yourDOSzip.exe                          // fix up internal offsets
> 

That's it. You can still test, update and delete entries from the archive; it's a fully functional zipfile.

So given all of that let's go over some problems.

How do you read a zip file?

This is undefined by the spec.

There are 2 obvious ways.

  1. Scan from the front, when you see an id for a record do the appropriate thing.

  2. Scan from the back, find the end-of-central-directory-record and then use it to read through the central directory, only looking at things the central directory references.

Scanning from the back is how the original pkunzip works. For one, it means that if you ask for some subset of files, it can jump directly to the data you need instead of scanning the entire zip file. This was especially important if the zip file spanned multiple floppy disks.
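Here's a minimal sketch of that backward scan, reusing the readEndOfCentralDirectory sketch from earlier. The end record is 22 bytes plus an up-to-65535-byte comment, so only the tail of the file needs searching.

// Scan backward from the end of the file for the
// end-of-central-directory signature.
function findEndOfCentralDirectory(view) {
  const minOffset = Math.max(0, view.byteLength - 22 - 0xFFFF);
  for (let offset = view.byteLength - 22; offset >= minOffset; --offset) {
    if (view.getUint32(offset, true) === 0x06054b50) {  // PK\x05\x06
      return readEndOfCentralDirectory(view, offset);
    }
  }
  throw new Error('no end-of-central-directory record found');
}

Note the sketch happily accepts the first match it finds scanning backward; the sections below are about why that's ambiguous.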

But, 4.1.9 says you can stream zip files. How is that possible? What if there is some local file record that is not referenced by the central directory? Is that valid? This is undefined.

4.3.1 states

Files MAY be added or replaced within a ZIP file, or deleted.

Okay? That suggests the central directory might not reference all the files in the zip file, because otherwise this statement about files being added, replaced, or deleted has no reason to be in the spec.

If I have file1.zip that contains files A, B, C and I generate file2.zip that only contains files A, B, those are just 2 independent zip files. It makes zero sense to put in the spec that you can add, replace, and delete files unless that knowledge somehow affects the format of a zip file.

In other words, if you have

  [local file A]
  [local file B]
  [local file C]
  [central directory file A]
  [central directory file C]
  [end of central directory]

Then clearly B is deleted as the central directory doesn't reference it. On the other hand, if there's no [local file B] then you just have an independent zip file, independent of some other zip file that has B in it. No need for the spec to even mention that situation.

Similarly if you had

  [local file A (old)]
  [local file B]
  [local file C]
  [local file A (new)]
  [central directory file A(new)]
  [central directory file B]
  [central directory file C]
  [end of central directory]

Then A (old) has been replaced by A (new) according to the central directory. If on the other hand there is no [local file A (old)] you just have an independent zip file.

You might think this is nonsense but you have to remember, pkzip comes from the era of floppy disks. Reading an entire zip file's contents and writing out a brand new zip file could be an extremely slow process. In both cases, the ability to delete a file just by updating the central directory, or to add a file by reading the existing central directory, appending the new data, then writing a new central directory, is a desirable feature. This would be especially true if you had a zip file that spanned multiple floppy disks; something that was common in 1989. You'd like to be able to update a README.TXT in your zip file without having to re-write multiple floppies.

In discussion with PKWare, they state the following

The format was originally intended to be written front-to-back so the central directory and end of central directory record could be written out last after all files included in the ZIP are known and written. If adding files, changes can applied without rewriting the entire file. This was how the original PKZIP application was designed to write .ZIP files. When reading, it will read the ZIP file end of central directory first to locate the central directory and then seek to any files it needs to access

Of course "add" is different than "delete" and "replace".

Whether or not it is valid to have local files not referenced by the central directory is undefined by the spec. It is only implied by the mention of:

Files MAY be added or replaced within a ZIP file, or deleted.

If it is valid for the central directory to not reference all the local files, then reading a zip file by scanning from the front may fail. Without special care you'd get files that aren't supposed to exist, or errors from trying to overwrite existing files.

But, that contradicts 4.1.9, which says zip files may be streamed. If zip files can be streamed, then both of the examples above would fail, because in the first case we'd see file B and in the second we'd see file A (old) before we saw that the central directory doesn't reference them. If you have to wait for the central directory before you can correctly use any of the entries, then functionally you can not stream zip files.

Can the self extracting portion have any zip IDs in it?

Seeing the instructions for how to create a self extracting zip file above, we just prepend some executable code to the front of the file and then fix the offsets in the central directory.

So let's say your self extractor has code like this

switch (id) {
  case 0x06054b50:  // "PK\x05\x06" - end of central directory
    read_end_of_central_directory();
    break;
  case 0x04034b50:  // "PK\x03\x04" - local file record
    read_local_file_record();
    break;
  case 0x02014b50:  // "PK\x01\x02" - central directory file record
    read_central_file_record();
    break;
  ...
}

Given the code above, it's likely those values 0x06054b50, 0x04034b50, 0x02014b50 will appear as binary in the self extracting portion at the front of the file. If you read a zip file by scanning from the front, your scanner may see those ids and misinterpret them as zip records.

In fact you can imagine a self extractor with a zip file in it like this

// data for a zip file that contains
//   LICENSE.txt
//   README.txt
//   player.exe
const unsigned char runtimeAndLicenseData[] = {
  0x50, 0x4b, 0x03, 0x04, ??, ??, ...
};

int main() {
   extractZipFromFile(getPathToSelf());
   extractZipFromMemory(runtimeAndLicenseData, sizeof(runtimeAndLicenseData));
}

Now there's a zip file in the self extractor. Any reader that reads from the front would see this inner zip file and fail. Is that a valid zip file? This is undefined by the spec.

I tested this. The original PKUnzip.exe in DOS, the Windows Explorer, the MacOS Finder, and Info-Zip (the unzip included in MacOS and Linux) all clearly read from the back and see the files after the self extractor. 7z and Keka see the embedded zip inside the self extractor.

Is that a failure, or is that a bad zip file? The APPNOTE.TXT does not say. I think it should be explicit here, and I think it's one of those unstated assumptions. PKunzip scans from the back, so this just happens to work, but the fact that it only works that way is never documented. The possibility that the data in the self-extractor might happen to resemble a zip file is just glossed over. Similarly, streaming will likely fail here, if it hasn't already from the previous issues.

You might think this is a non-issue, but there are 100s of thousands of self extracting zip files from the 1990s out there in the archives. A forward scanner might fail to read them.

Can the zip comment contain zip IDs in it?

If you go look at 4.3.16 above, you'll see the end of a zip file is a variable-length comment. So, if you're doing backward scanning, you basically read from the back of the file looking for 0x50 0x4B 0x05 0x06. But what if that sequence of bytes is in the comment?

I'm sure Phil Katz never gave it a second thought. He just assumed people would put the equivalent of a README.txt in there. As such it would only have values from 0x20 to 0x7F, with maybe a 0x0D (carriage return), 0x0A (linefeed), 0x09 (tab), or 0x07 (bell).

Unfortunately all of the byte values in the ids are valid ASCII, even valid utf-8. We already went over 0x50 = P and 0x4B = K. 0x06 is "Acknowledge" (ACK) in ASCII and 0x05 is "Enquiry" (ENQ).

The APPNOTE.TXT should arguably explicitly specify if this is invalid. Indirectly 4.3.1 says

A ZIP file MUST have only one "end of central directory record"

But what does that mean? Does it mean the bytes 0x50 0x4B 0x05 0x06 can't appear in the comment or in the self extracting code? Does it mean that, scanning from the back, the first time you see that sequence you don't try to find a second match?

If a file's comment does contain that sequence, and you run into none of the issues mentioned before, then a forward scanner would successfully read it. On the other hand, pkunzip itself would fail.
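One heuristic a backward scanner could use, though nothing in the APPNOTE.TXT blesses it, is to accept a candidate signature only if its comment-length field lands exactly on the end of the file:

// Accept a candidate end record only if its comment-length field
// reaches exactly to EOF (22 = size of the fixed portion).
function isPlausibleEndRecord(view, offset) {
  if (view.getUint32(offset, true) !== 0x06054b50) {
    return false;
  }
  const commentLength = view.getUint16(offset + 20, true);
  return offset + 22 + commentLength === view.byteLength;
}

A comment deliberately crafted to contain a fake, self-consistent end record still defeats this check, which is exactly the ambiguity.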

What if the offset to the central directory is 101,010,256?

That offset is 0x06054b50, so stored little-endian its bytes are 0x50 0x4B 0x05 0x06, and it will appear to be an end-of-central-directory signature. I think a 100-meg zip file wasn't even on the radar when zip was created, and in fact extensions were required to handle files larger than 4 gig. But it does show one more way the format is poorly designed.

What's a good design?

There's certainly debate to be had about what a good design would be, but some things are arguably easy to decide if we could start over.

  1. It would have been better if records had a fixed format like id followed by size so that you can skip a record you don't understand.

  2. It would have been better if the last record at the end of the file was just an offset-to-end-of-central-directory record as in

0x504b0609 (id: some id that is not in use)
       0x04000000 (size of data of record)
       0x???????? (relative offset to end-of-central-directory)
       

Then there would be no ambiguity when reading from the back (see the sketch after this list).

    1. Read the last 12 bytes
    2. Check the first 8 are 0x50 0x4b 0x06 0x09 0x04 0x00 0x00 0x00. If not, fail.
    3. Read the offset and go to the end-of-central-directory

Or, conversely, put the comment in its own record, write it before the central directory, and put an offset to it in the end-of-central-directory record. Then at least this issue of scanning over the comment would disappear.

  3. Be clear about what data can appear in a self extracting stub.

    If you want to support reading from the front it seems required to state that the self extracting portion can't appear to have any records.

This is hard to enforce unless you specifically write a validator. If you just check whether your own app can read the zip file then, as it stands now, pkzip, pkunzip, Info-Zip (the zip in MacOS and Linux), Windows Explorer, and the MacOS Finder all ignore what's in the self extracting portion, so they aren't useful for validation. You must either explicitly state in the spec that readers must scan from the back, or write a validator that rejects zips that are not forward-scannable and state in the spec why.

  4. Be clear if the central directory can disagree with local file records

  5. Be clear if random data can appear between records

A backward scanner does not care what's between records. It only cares that it can find the central directory, and it only reads what that central directory points to. That means there can be any random data between records (or at least between some records).

    Be explicit if this is okay or not okay. Don't rely on implicit diagrams.
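For what it's worth, here's a sketch of reading the trailer record proposed in item 2. The id bytes and the layout are the hypothetical ones from this post, not anything in real zip:

// Read the proposed fixed 12-byte trailer:
// [4-byte id][4-byte size = 4][4-byte offset to end-of-central-directory].
function readProposedTrailer(view) {
  const offset = view.byteLength - 12;             // 1. read the last 12 bytes
  const expected = [0x50, 0x4b, 0x06, 0x09, 0x04, 0x00, 0x00, 0x00];
  for (let i = 0; i < 8; ++i) {                    // 2. check the first 8 bytes
    if (view.getUint8(offset + i) !== expected[i]) {
      throw new Error('invalid file');             //    if not, fail
    }
  }
  return view.getUint32(offset + 8, true);         // 3. the offset to the end record
}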

What to do, how to fix?

If I were to guess, all of these issues are implementation details that didn't make it into the APPNOTE.TXT. What I believe the APPNOTE.TXT really wants to say is "a valid zip file is one that pkzip can manipulate and pkunzip can correctly unzip." Instead it defines things in such a way that various implementations can make files that other implementations can't read.

Of course with 32 years of zip files out there we can't fix the format. What PKWare could do is get specific about these edge cases. If it were me, I'd add these sections to the APPNOTE.TXT:

4.3.1 A ZIP file MUST contain an "end of central directory record". A ZIP file containing only an "end of central directory record" is considered an empty ZIP file. Files MAY be added or replaced within a ZIP file, or deleted. A ZIP file MUST have only one "end of central directory record". Other records defined in this specification MAY be used as needed to support storage requirements for individual ZIP files.

The "end of central directory record" must be at the end of the file and the sequence of bytes, 0x50 0x4B 0x05 0x06, must not appear in the comment.

The "central directory" is the authority on the contents of the zip file. Only the data it references is valid to read from the file. This is because (1) the contents of the self extracting portion of the file are undefined and might appear to contain zip records when in fact they are not related to the zip file, and (2) the ability to add, update, and delete files in a zip file stems from the fact that only the central directory knows which local files are valid.

That would be one way. I believe readers following this interpretation would handle the 100s of millions of existing zip files out there.

On the other hand, if PKWare claims files with these issues don't exist, then this would work as well:

4.3.1 A ZIP file MUST contain an "end of central directory record". A ZIP file containing only an "end of central directory record" is considered an empty ZIP file. Files MAY be added or replaced within a ZIP file, or deleted. A ZIP file MUST have only one "end of central directory record". Other records defined in this specification MAY be used as needed to support storage requirements for individual ZIP files.

The "end of central directory record" must be at the end of the file and the sequence of bytes, 0x50 0x4B 0x05 0x06, must not appear in the comment.

There can be no [local file records] that do not appear in the central directory. This guarantee is required so reading a file front to back provides the same results as reading it back to front. Any file that does not follow this rule is an invalid zip file.

A self extracting zip file must not contain any of the sequences of record ids listed in this document, as they may be misinterpreted by forward-scanning zip readers. Any file that does not follow this rule is an invalid zip file.

I hope they will update the APPNOTE.TXT so that the various zip readers and zip creators can agree on what's valid.

Unfortunately I feel like PKWare doesn't want to be clear here. Their POV seems to be that zip is an ambiguous format: if you want to read by scanning from the front, then just don't try to read files you can't read that way. They're still valid zip files, and the fact that you can't read them is irrelevant; it's just your choice not to support them.

I suppose that's a valid POV. Few if any zip libraries handle every feature of zip. Still, it would be nice to know if you're intentionally not handling something or if you're just reading the file wrong and getting lucky that it works sometimes.

The reason all this came up is I wrote a javascript unzip library. There are tons out there, but I had special needs the other libraries I found didn't handle. In particular I needed a library that lets me read a single file from a large zip as fast as possible. That means backward scanning, finding the offset to the desired file, and decompressing just that single file. Hopefully others find it useful.
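The technique, roughly, looks like this. Note this is a sketch of the approach, not the library's actual API: it reuses the findEndOfCentralDirectory sketch from earlier and assumes the view spans exactly the zip file, deflate compression (method 8), no zip64, and a browser with DecompressionStream.

// Read one file from a zip by walking the central directory,
// then inflating only that entry's bytes.
async function readOneFile(view, wantedName) {
  const end = findEndOfCentralDirectory(view);   // from the earlier sketch
  let offset = end.centralDirOffset;
  const decoder = new TextDecoder();
  for (let i = 0; i < end.totalEntries; ++i) {
    const nameLen    = view.getUint16(offset + 28, true);
    const extraLen   = view.getUint16(offset + 30, true);
    const commentLen = view.getUint16(offset + 32, true);
    const name = decoder.decode(new Uint8Array(
        view.buffer, view.byteOffset + offset + 46, nameLen));
    if (name === wantedName) {
      const compressedSize = view.getUint32(offset + 20, true);
      const localOffset    = view.getUint32(offset + 42, true);
      // The local header repeats the name/extra lengths (they can differ
      // from the central directory's), so read them to find the data.
      const localNameLen  = view.getUint16(localOffset + 26, true);
      const localExtraLen = view.getUint16(localOffset + 28, true);
      const dataStart = localOffset + 30 + localNameLen + localExtraLen;
      const compressed = new Uint8Array(
          view.buffer, view.byteOffset + dataStart, compressedSize);
      const stream = new Blob([compressed]).stream()
          .pipeThrough(new DecompressionStream('deflate-raw'));
      return new Uint8Array(await new Response(stream).arrayBuffer());
    }
    offset += 46 + nameLen + extraLen + commentLen;
  }
  throw new Error(`no entry named: ${wantedName}`);
}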


You might find this history of Zip fascinating

Comments