JSBenchIt.org and JSGist.org

2020-11-09

I recently made 2 new sites. They came about like this.

Once in a while I want to benchmark solutions in JavaScript just to see how much slower one solution is vs another. I used to use jsperf.com but sometime in early 2020 or 2019 it disappeared.

Searching around I found 2 others, jsbench.me and jsben.ch. Trying them, they both have their issues. Jsbench.me is ugly. Probably not fair but whatever. Using it bugged me. Jsben.ch, at least as of this writing, had 2 issues when I tried to use it. One is that if my code had a bug the site would crash, as in it would put up a full window UI that blocks everything with a "running..." message and then never recover. The result was all the code I typed in was lost and I'd have to start over. The other is it has a 4k limit. 4k might sound like a lot but I ran into that limit trying to test a fairly simple thing. I managed to squeeze my test in with some work but, worse, there's no contact info anywhere except a donate button that leads directly to the donation site, not a contact site, so there's no way to even file a bug let alone make a suggestion.

In any case I put up with it for 6 months or so but then one day about a month ago, I don't remember what triggered it but I figured I could make my own site fairly quickly where I'm sure in my head quickly meant 1-3 days max. 😂

benchmark.js

So, this is what happened. First I decided I should use benchmark.js, mostly because I suck at math and it claims "statistically significant results". I have no idea what that means 😅 but a glance at the code shows some math happening that's more than I'd do if I just wrote my own timing functions.
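For contrast, here is a minimal sketch of what "just writing my own timing functions" might look like. It is not what benchmark.js does, and the name `timeIt` and its `minDurationMs` parameter are made up for illustration. It runs the function repeatedly for a minimum amount of time and reports a single ops/sec figure, with no repeated sampling and no margin of error.

```javascript
// Naive timing helper (hypothetical, for illustration only).
// Calls fn over and over until at least minDurationMs has elapsed,
// then reports how many calls per second that works out to.
function timeIt(fn, minDurationMs = 50) {
  let iterations = 0;
  let elapsed = 0;
  const start = performance.now();
  do {
    fn();
    iterations++;
    elapsed = performance.now() - start;
  } while (elapsed < minDurationMs);
  return {
    iterations,
    elapsedMs: elapsed,
    opsPerSec: iterations / (elapsed / 1000),
  };
}

// Example: compare two ways of building a string.
const a = timeIt(() => [1, 2, 3].join(','));
const b = timeIt(() => `${1},${2},${3}`);
console.log(Math.round(a.opsPerSec), 'vs', Math.round(b.opsPerSec));
```

A single run like this is easily skewed by JIT warm-up and whatever else the machine is doing, which is roughly the gap the extra math in a benchmarking library is meant to cover.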

Unfortunately I'd argue benchmark.js is actually not a very good library. They made up some username or org name called "bestiejs" to make it sound like it's good and they claim tests and docs but the docs are horrible auto-generated docs. They don't actually cover how to use the library. They just list a bunch of classes and methods and it's left to you to figure out which functions to call and when and why. There are also some very questionable design choices, like the way you add setup code is by manually patching the prototype of one of their classes. WAT?!?

I thought about writing my own anyway and trying to extract the math parts but eventually I got things working enough and just wanted to move on.

Personal Access Tokens

I also didn't want to run a full server with database and everything else so I decided I'd see if it was possible to store the data in a github gist. It turns out yes, it's possible, but I also learned there is no way to make a purely static website that supports OAuth to let the user log in to github.

A workaround is a user can make a Personal Access Token which is a fancy way of basically making a special password that is given certain permissions. So, in order to save the user would have to go to github, manually make a personal access token, paste it into jsbenchit.org and then they could save. It worked! 🎉
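As a rough sketch of what saving with a token involves (not the site's actual code), the GitHub REST API creates a gist with a POST to `https://api.github.com/gists`, and the token needs the `gist` scope. The function names, the `public: true` choice, and the example file name below are assumptions made for illustration.

```javascript
// Build the request separately from sending it so it is easy to inspect.
// `files` maps a file name to { content: '...' } as the gist API expects.
function buildCreateGistRequest(token, description, files) {
  return {
    url: 'https://api.github.com/gists',
    options: {
      method: 'POST',
      headers: {
        'Accept': 'application/vnd.github+json',
        'Authorization': `token ${token}`,
        'Content-Type': 'application/json',
      },
      // public: true is an arbitrary choice for this sketch.
      body: JSON.stringify({ description, public: true, files }),
    },
  };
}

// Send the request and return the id of the new gist.
async function saveGist(token, description, files) {
  const { url, options } = buildCreateGistRequest(token, description, files);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`GitHub responded with ${res.status}`);
  }
  const gist = await res.json();
  return gist.id;
}
```

Usage would be something like `saveGist(token, 'my benchmark', { 'benchmark.json': { content: JSON.stringify(data) } })`, where `benchmark.json` is a made-up file name.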

As I got it working I realized I could also make a site similar to jsfiddle or codepen with only minor tweaks to the UI so I started on that too.

¯\_(ツ)_/¯

Running User Code

Both sites run arbitrary user code and so if I didn't do something people could write code that steals the personal access token. That's no good. Stealing their own token is not an issue but passing a benchmark or jsgist to another user would let them steal that user's token.

The solution is to run the user's code in an iframe on another domain. That domain can't read any of the data from the main site so problem is solved.

Unfortunately I ran into a new problem. Well, maybe it's not so new. The problem is since the servers are static I can't serve the user's code like a normal site would. If you look at jsfiddle, codepen, and stack overflow snippets you'll see they run the code from a server served page generated using the user's code. With a static site I don't have that option.

To work around it I generate a blob, make a URL to the blob, and have the browser load that in an iframe. I use this solution on webglfundamentals.org, webgl2fundamentals.org, and threejsfundamentals.org. It works but it has other problems. One is since I can't serve any files whatsoever I have to re-write URLs if you use more than 1 file.

Take for example something that uses workers. You usually need a minimum of 2 files: an HTML file with a <script> section that launches a worker, and the worker's script in another file. So you start with main.html that loads worker.js but you end up with blob:1234-1314523-1232 for main.html. It's still referencing worker.js and you have to somehow find that and change it to the blob url that was generated for worker.js. I actually implemented this solution on those sites I mentioned above but it only works because I wrote all the examples that are running live and the solution only handles the small number of cases I needed to work.
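A minimal sketch of the blob approach and the URL rewriting it forces, assuming references appear as whole quoted strings like `'worker.js'`. The helper names are hypothetical and this is deliberately naive; it is not the code used on the sites mentioned above.

```javascript
// Escape a file name so it can be used literally inside a RegExp.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Replace quoted references like 'worker.js' or "worker.js" with the
// URL recorded for that file in urlByName.
function rewriteReferences(source, urlByName) {
  let out = source;
  for (const [name, url] of Object.entries(urlByName)) {
    const pattern = new RegExp(`(["'])${escapeRegExp(name)}\\1`, 'g');
    out = out.replace(pattern, (match, quote) => `${quote}${url}${quote}`);
  }
  return out;
}

// Turn a string into a blob: URL the browser can load.
function makeBlobUrl(content, type) {
  return URL.createObjectURL(new Blob([content], { type }));
}

// files: { 'main.html': { content, type }, 'worker.js': { content, type } }
// Returns a blob URL for the main file with its references rewritten.
function buildPage(files, mainName) {
  const urlByName = {};
  for (const [name, file] of Object.entries(files)) {
    if (name !== mainName) {
      urlByName[name] = makeBlobUrl(file.content, file.type);
    }
  }
  const main = files[mainName];
  return makeBlobUrl(rewriteReferences(main.content, urlByName), main.type);
}
```

In a page, the result of `buildPage(files, 'main.html')` would be assigned to an iframe's `src`. Anything that builds the file name at runtime, or references it without quotes, slips straight past a rewrite like this, which is why it only holds up for examples you control.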

The second problem with the blob solution is that blobs are no good for debugging. Every time the user clicks "run" new blobs are created so any breakpoints you set last time you ran it don't apply to the new blob since they're associated with a URL and that URL has just changed.

Looking into it I found out I could solve both problems with a service worker. The main page starts the service worker then injects the filename/content of each file into the service worker. It then references those files as normal URLs so they don't change. Both problems are solved. 😊
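Roughly, the service worker side could look like the sketch below. The message shape (`{ files: [{ path, content, type }] }`) and the function names are assumptions for illustration, not the sites' actual protocol.

```javascript
// In-memory table of the user's files, keyed by URL path.
const files = new Map();

// Record files sent over from the main page.
function storeFiles(entries) {
  for (const { path, content, type } of entries) {
    files.set(path, { content, type });
  }
}

// Find the stored file for a request URL, or null if there is none.
function lookup(url) {
  const { pathname } = new URL(url);
  return files.get(pathname) || null;
}

// Only hook up events when actually running as a service worker.
if (typeof ServiceWorkerGlobalScope !== 'undefined' &&
    self instanceof ServiceWorkerGlobalScope) {
  self.addEventListener('message', (event) => {
    storeFiles(event.data.files);
  });

  self.addEventListener('fetch', (event) => {
    const file = lookup(event.request.url);
    if (file) {
      event.respondWith(new Response(file.content, {
        headers: { 'Content-Type': file.type },
      }));
    }
    // Anything not in the table falls through to the network as usual.
  });
}
```

The page would register this script with `navigator.serviceWorker.register(...)`, post the files to the active worker, and then point the iframe at a normal URL such as `/main.html`. One caveat with keeping the files in a plain `Map`: browsers can stop an idle service worker, and the table is gone when it restarts, so the page would likely need to re-send the files before each run.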

I went on to continue making the sites even though I was way past the amount of time I thought I'd be spending on them.

Github Login

In using the sites I ran into a new problem. Using a personal access token sucked! I have at least 4 computers I want to run these on. A Windows PC, a Mac, an iPhone, and an Android phone. When I'd try to use a different PC I needed to either go through the process of making a new personal access token, or I needed to find some way to pass that token between machines, like email it to myself. 🤮

I wondered if I could get the browser to save it as a password. It turns out, yes, if you use a <form> and an <input type="password"> and you apply the correct incantation when the user clicks a "submit" button the browser will offer to save the personal access token as a password.

Problem solved? No 😭

A minor issue is there's no username but the browsers assume it's always username + password. That's not that big a deal, I can supply a fake username though it will probably confuse users.

A major issue though is that passing between machines via a browser's password manager doesn't help pass between different browsers. If I want to test Firefox vs Chrome vs Safari then I was back to the same problem of keeping track of a personal access token somewhere.

Now I was entering sunk cost issues. I'd spent a bunch of time getting this far but the personal access token issues seemed like they'd likely make no one use either site. If no one is going to use it then I've wasted all the time I put in already.

Looking into it more it turns out the amount of "server" needed to support OAuth so that users could log in with github directly is actually really tiny. No storage is needed, almost nothing.

Basically the way OAuth works is

  1. user clicks "login with github"
  2. static page opens a popup to https://github.com/login/oauth/authorize and passes it an app id (called client_id), the permissions the app wants, and something called "state" which you make up.
  3. The popup shows github's login and asks for permission for the app to use whatever features it requested.
  4. If the user picks "permission granted" then the github page redirects to some page you pre-registered when you registered your app with github. For our case this would be https://jsbenchit.org/auth.html. To this page the redirect includes a "code" and the "state" passed at step 2.
  5. The auth.html page either directly or by communicating with the page that opened the popup, first verifies that the "state" matches what was sent at step 2. If not something is fishy, abort. Otherwise it needs to contact github at a special URL and pass the "code", the "client_id" and a "client_secret".

    Here's the part that needs a server. The page can't send the secret because then anyone could read the secret. So, the secret needs to be on a server. So,

  6. The page needs to contact a server you control that contains the secret and passes it the "code". That server then contacts github, passes the "code", "client_id", and "client_secret" to github. In response github will return an access token which is exactly the same as a "personal access token" except the process for getting one is more automated.
  7. The page gets the access token from the server and starts using it to access github's API

If you were able to follow that the short part is you need a server, and all it has to do is given a "code", contact github, pass the "code", "client_id" and "client_secret" on to github, and pass back the resulting token.
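The browser side of steps 2, 5, and 6 could be sketched like this. The authorize URL and the `client_id`, `scope`, and `state` parameters are GitHub's. The function names, the token server URL, and the `{ access_token }` response shape are assumptions for illustration.

```javascript
const AUTHORIZE_URL = 'https://github.com/login/oauth/authorize';

// Step 2: make up an unguessable "state" value.
function makeState() {
  const bytes = new Uint8Array(16);
  crypto.getRandomValues(bytes);
  return Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');
}

// Step 2: build the URL the popup is opened with.
function buildAuthorizeUrl({ clientId, scope, state }) {
  const url = new URL(AUTHORIZE_URL);
  url.searchParams.set('client_id', clientId);
  url.searchParams.set('scope', scope);
  url.searchParams.set('state', state);
  return url.toString();
}

// Step 5: on the redirect page, check "state" and pull out "code".
function readRedirect(redirectUrl, expectedState) {
  const params = new URL(redirectUrl).searchParams;
  if (params.get('state') !== expectedState) {
    throw new Error('state mismatch, something is fishy');
  }
  return params.get('code');
}

// Steps 6 and 7: hand the code to the small server and get a token back.
// tokenServerUrl is hypothetical; so is the { access_token } response shape.
async function exchangeCode(tokenServerUrl, code) {
  const url = new URL(tokenServerUrl);
  url.searchParams.set('code', code);
  const res = await fetch(url);
  if (!res.ok) {
    throw new Error(`token server responded with ${res.status}`);
  }
  const { access_token } = await res.json();
  return access_token;
}
```

In a page this would be driven by something like `window.open(buildAuthorizeUrl({ clientId, scope: 'gist', state }), 'login')`, with the state remembered (for example in `sessionStorage`) so the redirect page can compare it. Note the `client_secret` never appears here, which is the whole point of step 6.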

Pretty simple. Once that's done the server is no longer needed. The client will function without contacting that server until and unless the token expires. This means that server can be stateless and basically only takes a few lines of code to run.

I found a couple of solutions. One is called Pizzly. It's overkill for my needs. It's a server that provides the OAuth server in step 6 above but it also tracks the tokens for you and proxies all other github requests, or requests to whatever service you're using. So your client side code just gets a pizzly user id which gets translated for you to the correct token.

I'm sure that's a great solution but it would mean paying for a much larger server, having to back up user accounts, and having to keep applying patches as security issues are found. It also means paying for all bandwidth between the browser and github because pizzly is in the middle.

Another repo though made it clear how simply the issue can be solved. It's this github-secret-keeper repo. It runs a few-line node server. I ran the free example on heroku and it works! But, I didn't want to make a heroku account. It seemed too expensive for what I needed it for. I also didn't feel like setting up a new droplet at Digital Ocean and paying $5 a month just to run this simple server that I'd have to maintain.

AWS Lambda

I ended up making an AWS Lambda function to do this which added another 3 days or so to try to learn enough AWS to get it done.

I want to say the experience was rather poor IMO. Here's the problem. All the examples I found showed lambda doing node.js like stuff, accepting a request, reading the parameters, and returning a response. Some showed the parameters already parsed and the response being structured. Trying that didn't work and it turns out the reason is AWS for this feature is split into 2 parts.

Part 1 is AWS Lambda which just runs functions in node.js or python or Java etc...

Part 2 is AWS API Gateway which provides public facing endpoints (URLS) and routes them to different services on AWS, AWS Lambda being one of those targets.

It turns out the default in AWS API Gateway doesn't match any of the examples I saw. In particular the default in AWS API Gateway is that you set up a ton of rules to parse and validate requests and parameters and only if they parse correctly and pass all the validation do they get forwarded to the next service. But that's not really what the examples wanted. Instead they wanted AWS API Gateway to effectively just pass through the request. That's not the default and I'd argue it not being the default is a mistake.

My guess is that service was originally written in Java. Because Java is strongly typed it was natural to think in terms of making the request fit strong types. Node.js on the other hand, is loosely typed. It's trivial to take random JSON, look at the data you care about, ignore the rest, and move on with your life.

In any case I finally figured out how to get AWS API Gateway to do what all the AWS Lambda examples I was seeing needed and it started working.
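For reference, a pass-through style handler might look like the sketch below. This is not the linked solution. It assumes API Gateway's Lambda proxy integration (which hands the function `event.queryStringParameters` and expects `{ statusCode, headers, body }` back), a Node.js runtime with a global `fetch`, and made-up environment variable names `CLIENT_ID` and `CLIENT_SECRET`. The token endpoint and its parameters are GitHub's.

```javascript
const TOKEN_URL = 'https://github.com/login/oauth/access_token';

// Build the request that trades a "code" for an access token.
function buildTokenRequest(code, clientId, clientSecret) {
  return {
    url: TOKEN_URL,
    options: {
      method: 'POST',
      headers: {
        // Ask github to answer with JSON instead of a form-encoded string.
        'Accept': 'application/json',
        'Content-Type': 'application/x-www-form-urlencoded',
      },
      body: new URLSearchParams({
        client_id: clientId,
        client_secret: clientSecret,
        code,
      }).toString(),
    },
  };
}

// Lambda entry point (proxy integration event and response shapes).
async function handler(event) {
  const params = event.queryStringParameters || {};
  if (!params.code) {
    return { statusCode: 400, body: JSON.stringify({ error: 'missing code' }) };
  }
  const { url, options } = buildTokenRequest(
    params.code,
    process.env.CLIENT_ID,      // hypothetical variable name
    process.env.CLIENT_SECRET,  // hypothetical variable name
  );
  const res = await fetch(url, options);
  const data = await res.json();
  return {
    statusCode: 200,
    headers: {
      'Content-Type': 'application/json',
      // Only the site itself should be able to read the response.
      'Access-Control-Allow-Origin': 'https://jsbenchit.org',
    },
    body: JSON.stringify({ access_token: data.access_token }),
  };
}

if (typeof module !== 'undefined') {
  module.exports = { handler };
}
```

Nothing is stored anywhere, which matches the "stateless, a few lines of code" description above. The secret lives only in the function's environment.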

My solution is here if you want to use it for github or any OAuth service.

CSS and Splitting

Next up was splitting and CSS. I still can't claim to be a CSS guru in any way shape or form and several times a year I run into infuriating CSS issues where I thought I'd get something done in 15 minutes but it turns into 15 minutes of the work I thought I was going to do and 1 to 4 hours of trying to figure out why my CSS is not working.

I think there are 2 big issues.

  1. is that Safari doesn't match Chrome and Firefox so you get something working only to find it doesn't work on Safari

  2. Nowhere does it seem to be documented how to make children always fill their parents. This is especially important if you're trying to make a page that acts more like an app where the data available should always fit on the screen vs a webpage that can be as tall as all the content.

    To be more clear you try (or I try) to make some layout like

       +--------------------+
       |                    |
       +---+------------+---+
       |   |            |   |
       |   |            |   |
       |   |            |   |
       +---+-----+------+---+
       |         |          |
       +---------+----------+
       

    and I want the entire thing to fill the screen and the contents of each area expand to use all of it. For whatever reason it never "just works". I'd think this would be trivial but something about it is not or at least not for me. It's always a bunch of sitting in the dev tools and adding random height: 100% or min-height: 0 or flex: 1 1 auto; or position: relative to various places in the hope things get fixed and they don't break something else or one of the other browsers. I'd think this would be common enough that the solution would be well documented on MDN or CSS Tricks or some place but it's not or at least I've never found it. Instead there's just all us clueless users reading the guesses of other clueless users sharing their magic solutions on Stack Overflow.

    I often wonder if any of the browser makers or spec writers ever actually use the stuff they make and why they don't work harder to spread the solutions.

    In any case my CSS seems to be doing what I want at the moment.

That said, I also ran into the issue that I needed a splitter control that let you drag the divider between two areas to adjust their sizes. There's 3 I found but they all had issues. One was out of date, and unmaintained and got errors with current React. Yea, I used react. Maybe that was a bad decision. Still not sure.

After fighting with the other solutions I ended up writing my own so that was a couple of days of working through issues.

Disqus

Next up was comments. I don't know why I felt compelled to add comments but I did. I felt like people being able to comment would be net positive. Codepen allows comments. The easiest thing to do is just tack on disqus. Similar to the user code issue though I can't use disqus directly on the main site otherwise they could steal the access token.

So, setup another domain, put disqus in an iframe. The truth is disqus already puts itself in an iframe but at the top level it does this with a script on your main page which means they can steal secrets if they want. So, yea, 3rd domain (2nd was for user code).

The next problem is there is no way in the browser to size an iframe to fit its content. It seems ridiculous to have that limitation in 2020 but it's still there. The solution is the iframe sends messages to the parent saying what size its content is and then the parent can adjust the size of the iframe to match. It turns out this is how disqus itself works. The script it uses to insert an iframe listens for messages to resize the iframe.
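The message passing described above could be sketched like this. The message shape (`{ type: 'resize', height }`) and the function names are assumptions. Disqus's real protocol is its own and is not reproduced here.

```javascript
// Inside the iframe: tell the parent how tall the content is.
// targetWindow would be window.parent in a real page.
function reportSize(targetWindow, parentOrigin, height) {
  targetWindow.postMessage({ type: 'resize', height }, parentOrigin);
}

// In the parent: apply a resize message to the iframe, but only if it
// really came from that iframe and from the origin we expect.
function applyResize(iframe, event, childOrigin) {
  if (event.origin !== childOrigin) return false;
  if (event.source !== iframe.contentWindow) return false;
  const data = event.data;
  if (!data || data.type !== 'resize' || !Number.isFinite(data.height)) {
    return false;
  }
  iframe.style.height = `${Math.ceil(data.height)}px`;
  return true;
}
```

In the parent page this would be wired up with `window.addEventListener('message', (e) => applyResize(iframe, e, childOrigin))`. In the child, a `ResizeObserver` on `document.documentElement` is one way to know when to call `reportSize` again. With an iframe inside an iframe, the middle page needs both halves: it listens to its child and reports to its own parent.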

Since I was doing iframe in iframe I needed to re-implement that solution.

It worked, problem solved..... or is it? 😆

Github Comments

It's a common topic on tech sites but there is a vocal minority that really dislike disqus. I assume it's because they are being tracked across the net. One half solution is you put a click through so that by default disqus doesn't load but the user can click "load comments" which is effectively an opt in to being tracked.

The thing is, gists already support comments. If only there was a way to use them easily on a 3rd party site like disqus. There isn't so, ....

There's an API where you can get the comments for a gist. You then have to format them from markdown into HTML. You need to sanitize them because it's user data and you don't want people to be able to insert JavaScript. I was already running comments on a 3rd domain so at least that part is already covered.
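As a conservative fallback sketch (not the approach described above, which renders the markdown and then sanitizes it), comment text can be escaped wholesale so that nothing in it is ever treated as markup. The list-comments endpoint and the `user.login` and `body` fields are from the GitHub REST API. The function names and the HTML structure are made up.

```javascript
// URL of the REST endpoint that lists a gist's comments.
function gistCommentsUrl(gistId) {
  return `https://api.github.com/gists/${encodeURIComponent(gistId)}/comments`;
}

// Escape the characters that are special in HTML text and attributes.
function escapeHtml(text) {
  const map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return text.replace(/[&<>"']/g, (c) => map[c]);
}

// Render one comment as inert HTML. The markdown is shown as plain text.
function renderComment(comment) {
  return '<article>' +
    `<h4>${escapeHtml(comment.user.login)}</h4>` +
    `<pre>${escapeHtml(comment.body)}</pre>` +
    '</article>';
}
```

This trades formatting for safety: users see raw markdown, but there is no path for a `<script>` tag or an `onerror` attribute to survive. Running on a separate domain, as the comments already do, is the second line of defense either way.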

In any case it wasn't too much work to get existing comments displayed. New comments was more work though.

Github gists display as follows

+----------+
| header   |
+----------+
| files    |
|          |
|          |
+----------+
| comments |
|          |
|          |
+----------+
| new      |
| comment  |
| form     |
+----------+

That comment form is way down the page. If there was an id to jump to I could have possibly put that page in an iframe and just used a link like https://gist.github.com/<id>/#new-comment-form to get the form to appear in a useful way. That would give the full github comment UI which includes drag and drop image attachments among other things. Even if putting it in an iframe sucked I could have just had a link in the form of

<a href="https://gist.github.com/<id>/#new-comment-form">click here to leave a comment</a>

But, no such ID exists, nor does any standalone new comment form page.

So, I ended up adding a form. But for a preview we're back to the problem of user data on a page that has access to a github token.

The solution is to put the preview on a separate page served from the comments domain and send a message with new content when the user asks for a preview. That way, even if we fail to fully sanitize the user content, it can't steal the tokens.

Embedding

Both sites support embedding

jsgist.org just uses iframes.

JsBenchIt supports 2 embed modes. One, uses an iframe.

+ there's no security issues (like I can't see any data on whatever site you embedded it)

- It's up to you to make your iframe fit the results

The other mode uses a script

+ it can auto size the frame

- if I was bad I could change the script and steal your login credentials for whatever site you embed it on.

Of course I'd never do that but it's something to be aware of. Maybe someone hacks my account or steals my domain etc... This same problem exists for any scripts you use from a site you don't control, like jQuery from a CDN for example, so it's not uncommon to use a script. Just pointing out the trade off.

I'm not sure what the point of embedding the benchmarks is but I guess you could show off your special solution, or show how some other solution is slow, or maybe post one and encourage others to try to beat it.

Closing Thoughts

I spent about a month, 6 to 12hrs a day on these 2 sites so far. There's a long list of things I could add, especially to jsgist.org. No idea if I will add those things. jsbenchit.org has a point for me because I didn't like the existing solutions. jsgist.org has much less of a point because there are 10 or so sites that already do something like this in various ways. jsfiddle, codepen, jsbin, codesandbox, glitch, github codespaces, plunkr, and I know there are others so I'm not sure what the point was. It started as just a kind of "oh, yea, I could do that too" while making jsbenchit and honestly I probably spent 2/3rds of the time there vs the benchmark site.

I honestly wish I'd find a way to spend this kind of time on something that has some hope of generating income, not just income but also something I'm truly passionate about. Much of this feels more like the procrastination project that one does to avoid doing the thing they should really do.

That said, the sites are live, they seem to kind of work though I'm sure there are still lurking bugs. Being stored in gists the data is yours. There is no tracking on the site. The sites are also open source so pull requests welcome.


Embedded Scripts - Stop it!

2020-10-14

I recently started making a website. I needed to store some credentials info locally in the user's browser. I had to give some thought to the fact that I can't let 3rd parties access those credentials and that's led to a bunch of rabbit holes.

It's surprising the number of services out there that will tell you to embed their JavaScript into your webpage. If you do that then those scripts could be reading all the data on the page including login credentials, your credit card number, contact info, whatever else is on the page.

In other words, for example, to use the disqus comment service you effectively add a script like this

<script src="https://yourblog.disqus.com/embed.js"></script>

Disqus uses that to insert an iframe and then show all the comments and the UI for adding more. I kind of wanted to add comments to the site above via disqus but there's no easy way to do it securely. The best I can think of is I can make a 2nd domain so that on the main page I create an iframe that links to the 2nd domain and that 2nd domain then includes that disqus script.

I'm not dissing disqus, I'm just more surprised this type of issue is not called out more as the security issue it is.

I looked into how codepen allows embedding a pen recently. Here's the UI for embedding

Notice of the 4 methods they mark HTML as recommended. Well if you dig through the HTML you see it does this

<script async src="https://static.codepen.io/assets/embed/ei.js"></script>

Yes, it pwns your page. Fortunately they offer using an iframe but it's surprising to me they recommend the insecure, we-own-your-site, embed-our-script-directly-on-your-page option over the others. In fact I'd argue it's irresponsible for them to offer that option at all. I'm not trying to single out codepen, it's common across many companies. Heck, Google Analytics is probably the most common embedded script with Facebook's being second.

I guess what goes through most people's heads who make this stuff is "we're trustworthy so nothing to worry about". Except,

  1. It sets a precedent to trust all such similar sites offering embedded scripts

  2. I might be able to trust "you" but can I trust all your employees and successors?

    We're basically setting up a world of millions of effectively compromised sites and then praying that it doesn't become an issue sometime in the future.

  3. Even if I trust you, you could be compelled to use your backdoor.

    I suppose this is unlikely but who knows. Maybe the FBI comes knocking requesting that for a specific site you help them steal credentials because they see your script is on the site they want to hack or get info from.

Anyway, I do have comments on this site by disqus using their script and I have google analytics on here too. This site though has no login, there are no credentials or anything else to steal. For the new site though I'll have to decide on whether or not I want to run comments at all and if so set up the second domain.


GitHub has a Permission Problem.

2020-09-27

TL;DR: Thousands of developers are giving 3rd parties write access to their github repos. This is even more irresponsible than giving out your email password or your computer's password since your github repos are often used by more than just you. The tokens given to 3rd parties are just like passwords. A hacker that breaches a company that has that info will suddenly have write access to every github repo the breached company had tokens for.

github should work to stop this irresponsible practice.

I really want to scream about security on a great many fronts but today let's talk about github.

What the actual F!!!

How is this not a 2000 comment topic on HN and Github not shamed into fixing this?

Github's permission systems are irresponsible in the extreme!!

Lots of sites let you sign up via github. Gatsby is one. Here's the screen you get when you try to sign up via your github account.

Like seriously, WTF does "Act on your behalf" mean? Does it mean Gatsby can have someone assassinated on my behalf? Can they take out a mortgage on my behalf? Can they volunteer me for the Peace Corps on my behalf? More seriously can they scan all my private repos on my behalf? Insert trojans in my code on my behalf? Open pull requests on other people's projects on my behalf? Log in to every other service I've connected to my github account on my behalf? Delete all my repos on my behalf? Add users to my projects on my behalf? Change my password on my behalf?

This seems like the most ridiculous permission ever!

I brought this up with github and they basically threw up their hands and said "Well, at least we're telling you something". No you're not. You're effectively telling me absolutely nothing except that you're claiming if I click through you're giving that company permission to do absolutely anything. How is that useful info?

But, just telling me isn't really the point. The point is each service should be required to use as few permissions as absolutely necessary. If I sign up for a service, the default should be no permissions except getting my email address. If a service is supposed to work with a repo (like gatsby is) then github should provide an interface such that gatsby tells github "Give me a list of repos the user wants me to use" and github presents the UI to select an existing one or create new ones and, when finished, only those repos are accessible and only with the minimal permissions needed.

This isn't entirely github's fault though, the majority of the development community seems asleep as well.

Let's imagine your bank let you sign in to 3rd party services in a similar manner. How many people would click through on "Let ACME corp act on your behalf on your Citibank Account". I think most people would be super scared of permissions like that. Instead they'd want very specific permission like, only permission to deposit money, or only permission to read the balance, or only permission to read transactions, etc...

Github providing blanket permissions to so many companies is a huge recipe for disaster just waiting to happen. If any one of those companies gets hacked, or has insider help, or has a disgruntled employee, suddenly your trade secrets are stolen, your unreleased app is leaked, your software is hacked with a trojan and all your customers sue you for the loss of work your app caused. It could be worse, you could run an open source library so by hacking ACME corp the bad guys can hack your library and via that hack everyone using your library.

I get why github does it and/or why the apps do it. For example check out Forestry. They could ask for minimal permissions and good on them for providing a path to go that route. They ask for greater permissions so that they can do all the steps for you. I get that. But if you allow them blanket access to your github (or gitlab), YOU SHOULD ARGUABLY BE DISQUALIFIED FROM BEING A SOFTWARE DEVELOPER!!!

The fact that you trusted some random 3rd party with blanket permissions to edit all of your repos and change all of your permissions is proof you don't know WTF you're doing and you can't be trusted. It's like if someone asked you for the password to your computer. If you give it out you're not computer literate!

boy: "Is it ok if I set my password to your birthday?"

girl: "Then your password is meaningless!"

Here's the default permissions Forestry asks for if you follow their recommended path.

First let's explain what Forestry is. It's a UI for editing blog posts through git so you can have a nice friendly interface for your git based static site generator. That's great! But, at most it only needs access to a single repo. Not all your public repos! If you click through and pick "Authorize" that's no different than giving them the password to your computer. Maybe worse because at least your hacked computer will probably only affect you.

Further, the fact that companies like Forestry even ask for this should be shameful! Remember when various companies like Facebook and Yelp would ask for the username and password for your email account when you signed up? Remember how pretty much every tech person on the planet knew that was seriously irresponsible to even ask? Well this is no different. It's entirely irresponsible for Forestry to ask for these kinds of blanket permissions! It's entirely irresponsible for any users to give them these permissions! How are all the tech leaders seemingly asleep at calling this out?

Like I mentioned above, part of this arguably lies at Github's feet. Forestry does this because github provides no good flow to do it well so Forestry is left with 2 options (1) be entirely irresponsible but make it easy for the user to use their service, (2) be responsible but lose sales because people can't get setup easily.

Instead it should be a sign they're an irresponsible and untrustworthy company that they ask for these kinds of permissions at all. And further, github should also be ashamed their system encourages these kinds of blanket permissions.

Think of it this way. There are literally millions of open source libraries. npm has over a million libraries and that's just JavaScript. Add in python, C++, Java, C#, ruby, and all the other projects on github. Hundreds of thousands of developers wrote those libraries. How many of those developers have given out the keys to their repos so random 3rd parties can hack their libraries? Maybe they gave too broad permissions to some code linting site. Maybe they gave too broad permissions to some project monitoring site. Maybe they gave too broad permissions just to join a forum using their github account. Isn't that super irresponsible? They've opened a door by using the library and they're letting more people in the door. That can't be good.

I don't blame the devs so much as github for effectively making this common. Github needs to take security seriously and that means working to make issues like this the exception, not the rule. It should be the easiest thing to do to allow a 3rd party minimal access to your repo. It should be much harder to give them too much access. There should be giant warnings that you're about to do something super irresponsible and that you should probably not be trusting the company asking for these permissions.

Call it out!

I don't mean to pick on Forestry. 100s of other github (and gitlab?) integrations have the same issues. Forestry was just the latest one I looked at. I've seen various companies have this issue for years now and I've been seriously surprised this hasn't been a bigger topic.


Don't clutter the UX with meaningless info

Look above at the Github permissions. Reading public info should not even be listed! It's already obvious that all your public info can be read by the app. That's the definition of public! There's no reason to tell me it might read it. It doesn't need permission to do so.


What if Google Was Like YouTube?

2020-09-26

This was just a random brain fart but ...

I get the impression that for many topics, youtube is more popular than web pages. Note: I have zero proof but it doesn't really matter for the point of this article.

Let's imagine there is a website that teaches JavaScript, for example this one.

Note: I have no idea how many people go to that site but compare it to this youtube channel which has millions of views.

For example this one video has 2.8 million views and it's just one of 100s of videos.

I have no idea but I suspect the youtube channel is far more viewed than the website.

Why is that?

At first I thought it was obvious, it's because more people like videos more than they like text for these topics. It's certainly easy to believe. Especially the younger generation, pretty much anyone under 25 has grown up with YouTube as part of their life.

There are lots of arguments to be made for video for learning coding. Seeing someone walk through the steps can be better than reading about how to do it. For one, it's unlikely someone writing a tutorial is going to remember to detail everything whereas someone making a video is at least likely showing the actual steps on the video. Small things they might have forgotten to write down appear in the video.

On the other hand, video sucks for reference and speed. I can't currently search the content of video. While I can cue a video and set it to different time points that's much worse than being able to skim or jump to the middle of an article.

Anyway, there are certainly valid reasons why a video might be more popular than an article on the same topic.

BUT!

What if one of the major reasons why videos are more popular than articles is YouTube itself? You go to youtube and based on what you watched before it recommends other things to watch. You watch one video on how to code in JavaScript and it's going to recommend watching more videos about programming in JavaScript and programming in general. It's also going to ask you to subscribe to those channels. You might even be set up to get emails when a youtuber posts a new video to their channel.

So, Imagine Google's home page worked the same way. Imagine instead of this

It looked more like this

Even before you searched you'd see recommendations based on things you searched for or viewed before. You'd see things you subscribed to. You'd see marks for stuff you'd read before. Your history would be part of the website just like it is on youtube. Google could even keep the [+] button in the top right which would lead to sites to create your content.

I can hear a lot of various responses.

I think it would be an interesting experiment. If not Google's current home page then some new one, youweb.com or something.

Like youtube it would mark what you've already read. Like youtube it would allow people to make channels. RSS is already in place to let people add their channels. Not sure how many systems still support this but there was a standard for learning where the page is for adding new content so clicking the [+] button could take you there, wherever it is, and Google could suggest places if you want to start from scratch including squarespace or wordpress.com or even blogger 😂

I think it might be super useful to have more sites recommended to me based on my interests. I watch youtube. I look at the recommendations. In fact I appreciate the recommendations. Why should websites be any different? Unlike Youtube the web is more decentralized so that's actually a win over Youtube. Why shouldn't Google (or someone) offer this service?

I'm honestly surprised it hasn't been done already. It probably has but I just forgot or didn't notice.

This might also make the tracking more visible. People claim Google knows all the sites you visit. Well, why not show it? If there's a Google analytics script on some site and Google recorded you went there, then you go to Google's home page and there in your history, just like Youtube's history, is a list of the sites you've visited. This would make it far more explicit so advocates for privacy could more easily point to it and say LOOK! It might also get people to pursue more ways to have things not get tracked. But, I suspect lots of people would also find it super useful and having Google recommend stuff based on that would seem natural given the interface. As it is now all they use that data for is to serve ads they think you might be interested in. Using that data to recommend pages seems more directly useful to me. Something I want, an article on a topic I'm interested in, vs something they want, to show me an ad. And it seems like no loss to them. They'll still get a chance to show me the ad.

Oh well, I expect the majority of people who will respond to this to be anti-Google and so anti this idea. I still think the idea is an interesting one. No site I know of recommends content for me in a way similar to Youtube. I'd like to try it out and see how it goes.


Update

Someone pointed out Chrome for Android and iOS has the "suggested articles" feature but trying it out it completely fails.

First off I turned it on and for me it recommended nothing but Japanese articles. Google knows my browsing history. It knows that 99% of the articles I read are English. The fact that it recommended articles in Japanese shows it completely failed to be anything like the youtube experience I'm suggesting. In fact Google claims the suggestions are based on my Web & App Activity but checking my Web & App Activity there is almost zero Japanese.

Further, there is no method to "Subscribe" to a channel, for whatever definition of "channel". There is nothing showing me articles I've read, though given my rant on Youtube showing me articles I've read maybe that's a good thing? I mean I can go into my account and check my activity but what I want is to be able to go to a page for a specific channel and see the list of all that channel's content and see which articles I've read and which I haven't.

So while it's a step toward providing a youtube like experience it's completely failing to come close to matching it.

Note: I believe "channels" are important. When you watch a youtube video most creators say "Click Subscribe!". It's arguably an important part of the youtube experience and needs to be present if we're trying to see what it would be like to bring that same experience to the web. Most sites already have feeds so a "channels" feature is arguably something relatively easy to implement for Google or whoever is providing this youtube like web experience.


Bad UI Design - Youtube

2020-09-25

Today's bad design thoughts - Youtube.

Caveat, maybe I'm full of it and there are reasons for the UI the way it is. I doubt it.

Youtube's recommendations drive me crazy. I'm sure they have stats or something that says their recommendations are perfect on average but maybe it's possible different people respond better to different kinds of recommendation systems?

As an example some people might like to watch the same videos again and again. Others might only want new videos (me!). So, when my recommendations are 10-50% for videos I've already watched it's a complete waste of time and space.

Here are some recommendations

You can see 2 of them have a red bar underneath. This signifies that I've watched this video. Don't recommend it to me please!!!

But it gets worse. Here's some more recommendations. The highlighted video I've already watched so I don't want it recommended.

I click the 3 dots and tell it "Not interested"

I then have to worry that youtube thinks I hate that channel which is not the case so this appears

Clicking "Tell Us Why" I get this

So I finally choose "I already watched this video".

WHY DID THAT TAKE 4 STEPS!??!?!

  1. click '...'
  2. pick "not interested"
  3. pick "tell us why"
  4. pick "I already watched this video"

It could be 3 steps

  1. click '...'
  2. pick "not interested"
  3. pick "I already watched this video"

It could even be 2 steps

  1. click '...'
  2. pick "I already watched this video"

Why is that 4 steps? What UX guidelines or process decided this needed to be 4 steps? It reminds me of the Windows Off Menu fiasco.

It gets worse though. Youtube effectively calls me a liar!

After those steps above I go to the channel for that user and you'll notice the video I marked as "I already watched this video" is not marked as watched with the red bar.

Imagine if in gmail you marked a message as read but Google decided, nope, we're going to keep it marked as un-read because we know better than you! I get it I guess. The red bar is not a "I watched this already" it's a "how much of this have I watched". Well, if I mark it as watched then mark it as 100% watched!!!

I'm also someone who would prefer to separate music from videos. If I want music I'll go to some music site, maybe even youtube music 🤮 Youtube seems to often fill my recommendations with 10-50% music playlists. STOP IT! You're not getting me to watch more videos (or listen to more music). You're just wasting my time.

Here 5 of 12 recommendations are for music! I'm on YouTUBE to watch things, not listen to things.

Now, maybe some users looking for something to watch end up clicking on 1-2 hr music videos or playlists. Fine, let me turn off all music so I can opt out of it. Pretty please 🥺 I'm happy to go to youtube.com/music or something if I want music from youtube or I'll search for it directly but in general if I go to youtube and I'm looking for recommendations I'm there to watch something.

Please Youtube, let me help you surface more videos I want to watch. Make it easier for me to tell you I've already watched the video and mark them as watched so when I'm glancing at videos in a channel it's easy to see what I have and haven't watched. Let me separate looking for music from looking for videos. Thank you 🙇‍♀️


Comparing Code Styles

2020-07-03

In a few projects in the past I made these functions

function createElem(tag, attrs = {}) { 
  const elem = document.createElement(tag);
  for (const [key, value] of Object.entries(attrs)) {
    if (typeof value === 'object') {
      for (const [k, v] of Object.entries(value)) {
        elem[key][k] = v;
      }
    } else if (elem[key] === undefined) {
      elem.setAttribute(key, value);
    } else {
      elem[key] = value;
    }
  }
  return elem;
}

function addElem(tag, parent, attrs = {}) {
  const elem = createElem(tag, attrs);
  parent.appendChild(elem);
  return elem;
}

It lets you create an element and fill the various parts of it relatively tersely.

For example

const form = addElem('form', document.body);

const checkbox = addElem('input', form, {
  type: 'checkbox',
  id: 'debug',
  className: 'bullseye',
});

const label = addElem('label', form, {
  for: 'debug',
  textContent: 'debugging on',
  style: {
    background: 'red',
  },
});

With the built in browser API this would be

const form = document.createElement('form');
document.body.appendChild(form);

const checkbox = document.createElement('input');
form.appendChild(checkbox);
checkbox.type = 'checkbox';
checkbox.id = 'debug';
checkbox.className = 'bullseye';

const label = document.createElement('label');
form.appendChild(label);
label.htmlFor = 'debug';  // the DOM property for the `for` attribute
label.textContent = 'debugging on';
label.style.background = 'red';

Recently I saw someone post they use a function more like this

function addElem(tag, attrs = {}, children = []) {
  const elem = createElem(tag, attrs);
  for (const child of children) {
    elem.appendChild(child);
  }
  return elem;
}

The difference from mine is that you pass in the children, not the parent. This suggests a nested style like this

document.body.appendChild(addElem('form', {}, [
  addElem('input', {
    type: 'checkbox',
    id: 'debug',
    className: 'bullseye',
  }),
  addElem('label', {
    for: 'debug',
    textContent: 'debugging on',
    style: {
      background: 'red',
    },
  }),
]));

I tried it out recently when refactoring someone else's code. No idea why I decided to refactor but anyway. Here's the original code

function createTableData(thead, tbody) {
  const row = document.createElement('tr');
  {
    const header = document.createElement('th');
    header.className = "text sortcol";
    header.textContent = "Library";
    row.appendChild(header);
    for(const benchmark of Object.keys(testData)) {
      const header = document.createElement('th');
      header.className = "number sortcol";
      header.textContent = benchmark;
      row.appendChild(header);
    };
  }
  {
    const header = document.createElement('th');
    header.className = "number sortcol sortfirstdesc";
    header.textContent = "Average";
    row.appendChild(header);
    thead.appendChild(row);
    for (let i = 0; i < libraries.length; i++) {
      const row = document.createElement('tr');
      row.id = libraries[i] + '_row';
      {
        const data = document.createElement('td');
        data.style.backgroundColor = colors[i];
        data.style.color = '#ffffff';
        data.style.fontWeight = 'normal';
        data.style.fontFamily = 'Arial Black';
        data.textContent = libraries[i];
        row.appendChild(data);
        for(const benchmark of Object.keys(testData)) {
          const data = document.createElement('td');
          data.id = `${benchmark}_${library_to_id(libraries[i])}_data`;
          data.textContent = "";
          row.appendChild(data);
        };
      }
      {
        const data = document.createElement('td');
        data.id = library_to_id(libraries[i]) + '_ave__data'
        data.textContent = "";
        row.appendChild(data);
        tbody.appendChild(row);
      }
    };
  }
}

While that code is verbose it's relatively easy to follow.

Here's the refactor

function createTableData(thead, tbody) {
  thead.appendChild(addElem('tr', {}, [
    addElem('th', {
      className: "text sortcol",
      textContent: "Library",
    }),
    ...Object.keys(testData).map(benchmark => addElem('th', {
      className: "number sortcol",
      textContent: benchmark,
    })),
    addElem('th', {
      className: "number sortcol sortfirstdesc",
      textContent: "Average",
    }),
  ]));
  for (let i = 0; i < libraries.length; i++) {
    tbody.appendChild(addElem('tr', {
      id: `${libraries[i]}_row`,
    }, [
      addElem('td', {
        style: {
          backgroundColor: colors[i],
          color: '#ffffff',
          fontWeight: 'normal',
          fontFamily: 'Arial Black',
        },
        textContent: libraries[i],
      }),
      ...Object.keys(testData).map(benchmark => addElem('td', {
        id: `${benchmark}_${library_to_id(libraries[i])}_data`,
      })),
      addElem('td', {
        id: `${library_to_id(libraries[i])}_ave__data`,
      }),
    ]));
  };
}

I'm not entirely sure I like it better. What I noticed when I was writing it is I found myself having a hard time keeping track of the opening and closing braces, parentheses, and square brackets. Effectively it's one giant expression instead of multiple individual statements.

Maybe if it was JSX it might hold the same structure but be more readable? Let's assume we could use JSX here then it would be

function createTableData(thead, tbody) {
  thead.appendChild((
    <tr>
      <th className="text sortcol">Library</th>
      {
        Object.keys(testData).map(benchmark => (
          <th className="number sortcol">{benchmark}</th>
        ))
      }
      <th className="number sortcol sortfirstdesc">Average</th>
    </tr>
  ));
  for (let i = 0; i < libraries.length; i++) {
    tbody.appendChild((
      <tr id={`${libraries[i]}_row`}>
        <td style={{
          backgroundColor: colors[i],
          color: '#ffffff',
          fontWeight: 'normal',
          fontFamily: 'Arial Black',
        }}>{libraries[i]}</td>
        {Object.keys(testData).map(benchmark => (
          <td id={`${benchmark}_${library_to_id(libraries[i])}_data`} />
        ))}
        <td id={`${library_to_id(libraries[i])}_ave__data`} />
      </tr>
    ));
  };
}

I really don't know which I like best. I'm sure I don't like the most verbose raw browser API version. The more terse it gets though the harder it seems to be to read.

Maybe I just need to come up with a better way to format?

I mostly wrote this post only because after the refactor I wasn't sure I was diggin it but writing all this out I have no ideas on how to fix my reservations. I did feel a little like I was solving a puzzle unrelated to the task at hand just to generate one giant expression.

Maybe my spidey senses are telling me it will be hard to read or edit later? I mean I do try to break down expressions into smaller parts now more than I did in the past. In the past I might have written something like

const dist = Math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2);

but now-a-days I'd be much more likely to write something like

const dx = x2 - x1;
const dy = y2 - y1;
const distSq = dx * dx + dy * dy;
const dist = Math.sqrt(distSq);

Maybe with such a simple equation it's hard to see why I prefer to spell it out. Maybe I prefer to spell it out because often I'm writing tutorials. Certainly my younger self thought terseness was "cool" but my older self finds terseness for the sake of terseness to be misplaced. Readability, understandability, editability, comparability I value over terseness.

hmmm....🤔


OpenGL Trivia

2020-06-10

I am not an OpenGL guru and I'm sure someone who is a guru will protest loudly and rudely in the comments below about something that's wrong here at some point but ... I effectively wrote an OpenGL ES 2.0 driver for Chrome. During that time I learned a bunch of trivia about OpenGL that I think is probably not common knowledge.

Until OpenGL 3.1 you didn't need to call glGenBuffers, glGenTextures, glGenRenderbuffers, glGenFramebuffers

You still don't need to call them if you're using the compatibility profile.

The spec effectively said that all the glGenXXX functions do is manage numbers for you but it was perfectly fine to make up your own numbers

const GLuint id = 123;  // a made up number, no glGenBuffers call
glBindBuffer(GL_ARRAY_BUFFER, id);
glBufferData(GL_ARRAY_BUFFER, sizeof(data), data, GL_STATIC_DRAW);

I found this out when running the OpenGL ES 2.0 conformance tests against the implementation in Chrome as they test for it.

Note: I am not suggesting you should not call glGenXXX! I'm just pointing out the trivia that they don't/didn't need to be called.

Texture 0 is the default texture.

You can set it the same as any other texture

glBindTexture(GL_TEXTURE_2D, 0);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, 1, 1, 0, GL_RGBA, GL_UNSIGNED_BYTE, oneRedPixel);

Now if you happen to use the default texture it will be red.

I found this out as well when running the OpenGL ES 2.0 conformance tests against the implementation in Chrome as they test for it. It was also a little bit of a disappointment to me that WebGL didn't ship with this feature. I brought it up with the committee when I discovered it but I think people just wanted to ship rather than go back and revisit the spec to make it compatible with OpenGL and OpenGL ES. Especially since this trivia seems not well known and therefore rarely used.

Compiling a shader is not required to fail even if there are errors in your shader.

The spec, at least the ES spec, says that glCompileShader can always return success. The spec only requires that glLinkProgram fail if the shaders are bad.

I found this out as well when running the OpenGL ES 2.0 conformance tests against the implementation in Chrome as they test for it.

This trivia is unlikely to ever matter to you unless you're on some low memory embedded device.
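
In practice this means the link status is the check that really counts. Here's a minimal sketch in WebGL flavored JavaScript, assuming `gl` is a WebGL context. The helper name `createProgramFromSources` is made up for illustration, it's not from any library.

```javascript
// Hypothetical helper: compile both shaders and link them, treating the
// link status as the authoritative check.
function createProgramFromSources(gl, vertexSource, fragmentSource) {
  const program = gl.createProgram();
  const shaders = [
    [gl.VERTEX_SHADER, vertexSource],
    [gl.FRAGMENT_SHADER, fragmentSource],
  ].map(([type, source]) => {
    const shader = gl.createShader(type);
    gl.shaderSource(shader, source);
    gl.compileShader(shader);  // allowed to report success even for bad source
    gl.attachShader(program, shader);
    return shader;
  });
  gl.linkProgram(program);
  if (!gl.getProgramParameter(program, gl.LINK_STATUS)) {
    // The shader logs are still the best place to find the actual error.
    const logs = shaders.map(s => gl.getShaderInfoLog(s)).join('\n');
    throw new Error(`link failed: ${gl.getProgramInfoLog(program)}\n${logs}`);
  }
  return program;
}
```

Checking the compile status as well doesn't hurt. The point is only that you can't rely on it alone.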

There were no OpenGL conformance tests until 2012-ish

I don't know the actual date but when I was using the OpenGL ES 2.0 conformance tests they were being back ported to OpenGL because there had never been an official set of tests. This is one reason there are so many issues with various OpenGL implementations or at least were in the past. Tests now exist but of course any edge case they miss is almost guaranteed to show inconsistencies across implementations.

This is also a lesson I learned. If you don't have comprehensive conformance tests for your standards, implementations will diverge. Making them comprehensive is hard but if you don't want your standard to devolve into lots of non-standard edge cases then you need to invest the time to make comprehensive conformance tests and do your best to make them easily usable with implementations other than your own. Not just for APIs, file formats are another place comprehensive conformance tests would likely help to keep the non-standard variations at a minimum.

Here are the WebGL2 tests as examples and here are the OpenGL tests. The OpenGL ones were not made public until 2017, 25yrs after OpenGL shipped.

Whether or not fragments get clipped by the viewport is implementation specific

This may or may not be fixed in the spec but it is not fixed in actual implementations. Originally the viewport set by glViewport only clipped vertices (and/or the triangles they create). But, for example, draw a 32x32 POINTS point whose center is 2 pixels off the edge of the viewport. Should the 14 pixels still inside the viewport be drawn? NVidia says yes, AMD says no. The OpenGL ES spec says yes, the OpenGL spec says no.

Arguably the answer should be yes otherwise POINTS are entirely useless for any size other than 1.0

POINTS have a max size. That size can be 1.0.

I don't think it's trivia really but it might be. Plenty of projects might use POINTS for particles and they expand the size based on the distance from the camera but it turns out they may never expand or they might be limited to some size like 64x64.

I find it very strange that there is a limit. I can imagine there is/was dedicated hardware to draw points in the past. It's relatively trivial to implement them yourself, with no size limit, using instanced drawing and some trivial math in the vertex shader, so I'm surprised that GPUs don't just use that method and drop the size limit.

But whatever, it's how it is. Basically you should not use POINTS if you want consistent behavior.
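
If you do use POINTS you can at least ask what the limit is. A small sketch in WebGL flavored JavaScript, assuming `gl` is a WebGL context. `clampPointSize` is just a name I made up.

```javascript
// Returns the point size the implementation can actually draw.
function clampPointSize(gl, desiredSize) {
  // ALIASED_POINT_SIZE_RANGE is a 2 element array: [min, max]
  const [minSize, maxSize] = gl.getParameter(gl.ALIASED_POINT_SIZE_RANGE);
  return Math.min(Math.max(desiredSize, minSize), maxSize);
}
```

On an implementation whose range is 1 to 1 this always returns 1, which is exactly the case where your particles never grow.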

LINES have a max thickness of 1.0 in core OpenGL

Older OpenGL and therefore the compatibility profile of OpenGL supports lines of various thicknesses although like points above the max thickness was driver/GPU dependent and allowed to be just 1.0. But, in the core spec as of OpenGL 3.0 only 1.0 is allowed period.

The funny thing is the spec still explains how glLineWidth works. It's only buried in the appendix that it doesn't actually work.

E.2.1 Deprecated But Still Supported Features

The following features are deprecated, but still present in the core profile. They may be removed from a future version of OpenGL, and are removed in a forward compatible context implementing the core profile.

  • Wide lines - LineWidth values greater than 1.0 will generate an INVALID_VALUE error.

The point is, except for maybe debugging you probably don't want to use LINES and instead you need to rasterize lines yourself using triangles.
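
As a rough idea of what "using triangles" means, here's a sketch that expands one 2D segment into a quad made of 2 triangles. It's only the simplest case: no joins, no caps, and the width is in the same units as the coordinates. The function name is mine.

```javascript
// Expands the segment (x1, y1)-(x2, y2) into a quad `width` units thick.
// Returns 6 vertices (2 triangles) as a flat [x, y, x, y, ...] array.
function lineToTriangles(x1, y1, x2, y2, width) {
  const dx = x2 - x1;
  const dy = y2 - y1;
  const length = Math.hypot(dx, dy);
  if (length === 0) {
    return [];  // degenerate segment, nothing to draw
  }
  // normal to the segment, scaled to half the line width
  const nx = (-dy / length) * width * 0.5;
  const ny = (dx / length) * width * 0.5;
  const a = [x1 + nx, y1 + ny];
  const b = [x1 - nx, y1 - ny];
  const c = [x2 + nx, y2 + ny];
  const d = [x2 - nx, y2 - ny];
  return [...a, ...b, ...c, ...c, ...b, ...d];
}
```

The resulting array can go straight into a buffer and be drawn with TRIANGLES. A real implementation would likely do the expansion in the vertex shader so the thickness can be in pixels.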

You don't need to setup any attributes or buffers to render.

This comes up from needing to make the smallest repros either to post on stack overflow or to file a bug. Let's assume you're using core OpenGL or OpenGL ES 2.0+ so that you're required to write shaders. Here's the simplest code to test a texture

// GLSL ES 3.00 version string; on desktop GL use "#version 330 core" instead
const GLchar* vsrc = R"(#version 300 es
void main() {
  gl_Position = vec4(0, 0, 0, 1);
  gl_PointSize = 100.0;
})";

const GLchar* fsrc = R"(#version 300 es
precision highp float;
uniform sampler2D tex;
out vec4 color;
void main() {
  color = texture(tex, gl_PointCoord);
})";

GLuint prg = someUtilToCompileShadersAndLinkToProgram(vsrc, fsrc);
glUseProgram(prg);

// this block only needed in GL, not GL ES
{
    glEnable(GL_PROGRAM_POINT_SIZE);
    GLuint vertex_array;
    glGenVertexArrays(1, &vertex_array);
    glBindVertexArray(vertex_array);
}

const GLubyte oneRedPixel[] = { 0xFF, 0x00, 0x00, 0xFF };
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, 1, 1, 0, GL_RGBA, GL_UNSIGNED_BYTE, oneRedPixel);

glDrawArrays(GL_POINTS, 0, 1);

Note: no attributes, no buffers, and I can test things about textures. If I wanted to try multiple things I can just change the vertex shader to

const GLchar* vsrc = R"(#version 300 es
layout(location = 0) in vec4 position;
void main() {
  gl_Position = position;
  gl_PointSize = 100.0;
})";

And then use glVertexAttrib to change the position. Example

glVertexAttrib2f(0, -0.5, 0);  // draw on left
glDrawArrays(GL_POINTS, 0, 1);
...
glVertexAttrib2f(0,  0.5, 0);  // draw on right
glDrawArrays(GL_POINTS, 0, 1);

Note that even if we used this second shader and didn't call glVertexAttrib we'd get a point in the center of the viewport. See next item.

PS: This may only work in the core profile.

The default attribute value is 0, 0, 0, 1

I see this all the time. Someone declares a position attribute as vec3 and then manually sets w to 1.

in vec3 position;
uniform mat4 matrix;
void main() {
  gl_Position = matrix * vec4(position, 1);
}

The thing is for attributes w defaults to 1.0 so this will work just as well

in vec4 position;
uniform mat4 matrix;
void main() {
  gl_Position = matrix * position;
}

It doesn't matter that you're only supplying x, y, and z from your attributes. w defaults to 1.
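
If it helps, here's the rule written out as plain JavaScript. This is only a model of what GL does for float attributes, not an API call, and `expandAttribute` is a made up name.

```javascript
// Models how GL fills in the attribute components your data doesn't supply.
// Missing components come from (0, 0, 0, 1).
function expandAttribute(components) {
  const defaults = [0, 0, 0, 1];
  return defaults.map((d, i) => (i < components.length ? components[i] : d));
}
```

So supplying `[1, 2, 3]` from a buffer to a vec4 attribute gives the shader `[1, 2, 3, 1]`, and supplying nothing at all gives `[0, 0, 0, 1]`.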

Framebuffers are cheap and you should create more of them rather than modify them.

I'm not sure if this is well known or not. It partly falls out from understanding the API.

A framebuffer is a tiny thing that just consists of a collection of references to textures and renderbuffers. Therefore don't be afraid to make more.

Let's say you're doing some multipass post processing where you swap inputs and outputs.

texture A as uniform input => pass 1 shader => texture B attached to framebuffer
texture B as uniform input => pass 2 shader => texture A attached to framebuffer
texture A as uniform input => pass 3 shader => texture B attached to framebuffer
texture B as uniform input => pass 4 shader => texture A attached to framebuffer
...

You can implement this in 2 ways

  1. Make one framebuffer, call gl.framebufferTexture2D to set which texture to render to between passes.

  2. Make 2 framebuffers, attach texture A to one and texture B to the other. Bind the other framebuffer between passes.

Method 2 is better. Every time you change the settings inside a framebuffer the driver potentially has to check a bunch of stuff at render time. Don't change anything and nothing has to be checked again.

This arguably includes glDrawBuffers which is also framebuffer state. If you need multiple settings for glDrawBuffers make a different framebuffer with the same attachments but different glDrawBuffers settings.

Arguably this is likely a trivial optimization. The more important point is framebuffers themselves are cheap.
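
Here's roughly what method 2 looks like in WebGL flavored JavaScript. It's a sketch, assuming `gl` is a WebGL context and each pass is a function that sets up its own program and issues the draw call. The names `createPingPongTargets` and `runPasses` are mine.

```javascript
// Make 2 textures, each permanently attached to its own framebuffer.
function createPingPongTargets(gl, width, height) {
  const targets = [];
  for (let i = 0; i < 2; ++i) {
    const texture = gl.createTexture();
    gl.bindTexture(gl.TEXTURE_2D, texture);
    gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, width, height, 0,
                  gl.RGBA, gl.UNSIGNED_BYTE, null);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
    const framebuffer = gl.createFramebuffer();
    gl.bindFramebuffer(gl.FRAMEBUFFER, framebuffer);
    // the only place attachments are ever set
    gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0,
                            gl.TEXTURE_2D, texture, 0);
    targets.push({texture, framebuffer});
  }
  gl.bindFramebuffer(gl.FRAMEBUFFER, null);
  return targets;
}

// Runs each pass, reading the previous result and writing to the other target.
// Returns the texture holding the final result.
function runPasses(gl, targets, sourceTexture, passes) {
  let input = sourceTexture;
  passes.forEach((drawPass, i) => {
    const dst = targets[i % 2];
    gl.bindFramebuffer(gl.FRAMEBUFFER, dst.framebuffer);  // just a bind
    gl.bindTexture(gl.TEXTURE_2D, input);
    drawPass(input);
    input = dst.texture;
  });
  gl.bindFramebuffer(gl.FRAMEBUFFER, null);
  return input;
}
```

Between passes the only framebuffer related call is `bindFramebuffer`. Nothing inside either framebuffer changes after creation.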

TexImage2D the API leads to interesting complications

Not too many people seem to be aware of the implications of TexImage2D. Consider that in order to function on the GPU your texture must be set up with the correct number of mip levels. You can set how many. It could be 1 mip. It could be a bunch but they each have to be the correct size and format. Let's say you have an 8x8 texture and you want to do the standard thing (not setting any other texture or sampler parameters). You'll also need a 4x4 mip, a 2x2 mip, and a 1x1 mip. You can get those automatically by uploading the level 0 8x8 mip and calling glGenerateMipmap.

Those 4 mip levels need to be copied to the GPU, ideally without wasting a lot of memory. But look at the API. There's nothing in it that says I can't do this

glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, 8, 8, 0, GL_RGBA, GL_UNSIGNED_BYTE, pixelData8x8);
glTexImage2D(GL_TEXTURE_2D, 1, GL_RGBA8, 20, 40, 0, GL_RGBA, GL_UNSIGNED_BYTE, pixelData40x20);
glTexImage2D(GL_TEXTURE_2D, 2, GL_RGBA8, 10, 20, 0, GL_RGBA, GL_UNSIGNED_BYTE, pixelData20x10);
glTexImage2D(GL_TEXTURE_2D, 3, GL_RGBA8, 5, 10, 0, GL_RGBA, GL_UNSIGNED_BYTE, pixelData10x5);
glTexImage2D(GL_TEXTURE_2D, 4, GL_RGBA8, 2, 5, 0, GL_RGBA, GL_UNSIGNED_BYTE, pixelData5x2);
glTexImage2D(GL_TEXTURE_2D, 5, GL_RGBA8, 1, 2, 0, GL_RGBA, GL_UNSIGNED_BYTE, pixelData2x1);
glTexImage2D(GL_TEXTURE_2D, 6, GL_RGBA8, 1, 1, 0, GL_RGBA, GL_UNSIGNED_BYTE, pixelData1x1);

If it's not clear what that code does a normal mipmap looks like this

but the mip chain above looks like this

Now, the texture above will not render but the code is valid, no errors, and, I can fix it by adding this line at the bottom

glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, 40, 80, 0, GL_RGBA, GL_UNSIGNED_BYTE, pixelData80x40);

I can even do this

glTexImage2D(GL_TEXTURE_2D, 6, GL_RGBA8, 1000, 1000, 0, GL_RGBA, GL_UNSIGNED_BYTE, pixelData1000x1000);
glTexImage2D(GL_TEXTURE_2D, 6, GL_RGBA8, 1, 1, 0, GL_RGBA, GL_UNSIGNED_BYTE, pixelData1x1);

or this

glTexImage2D(GL_TEXTURE_2D, 6, GL_RGBA8, 1000, 1000, 0, GL_RGBA, GL_UNSIGNED_BYTE, pixelData1000x1000);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAX_LEVEL, 5);

Do you see the issue? The API can't actually know anything about what you're trying to do until you actually draw. All the data you send to each mip just has to sit around until you call draw because there's no way for the API to know beforehand what state all the mips will be until you finally decide to draw with that texture. Maybe you supply the last mip first. Maybe you supply different internal formats to every mip and then fix them all later.

Ideally you'd specify the level 0 mip and then it would be an error to specify any other mip that does not match. Same internal format, correct size for the current level 0. That still might not be perfect because on changing level 0 all the mips might be the wrong size or format but it could be that changing level 0 to a different size invalidates all the other mip levels.

This is specifically why TexStorage2D was added but TexImage2D is pretty much ingrained at this point.
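
To make the "check at draw time" part concrete, here's a sketch in JavaScript of the kind of validation a driver has to defer. It's simplified: it ignores the base/max level and filter settings and only compares sizes and internal formats. Both function names are made up.

```javascript
// Sizes of every level in a full mip chain starting at width x height.
function mipChainSizes(width, height) {
  const sizes = [[width, height]];
  while (width > 1 || height > 1) {
    width = Math.max(1, width >> 1);
    height = Math.max(1, height >> 1);
    sizes.push([width, height]);
  }
  return sizes;
}

// `levels` is an array of {width, height, internalFormat}, indexed by mip
// level, representing whatever was last passed to TexImage2D for each level.
function isMipComplete(levels) {
  if (levels.length === 0) {
    return false;
  }
  const expected = mipChainSizes(levels[0].width, levels[0].height);
  return levels.length === expected.length &&
      levels.every((level, i) =>
          level.width === expected[i][0] &&
          level.height === expected[i][1] &&
          level.internalFormat === levels[0].internalFormat);
}
```

With the 7 levels from the example above and an 8x8 level 0 this returns false. Swap level 0 for 40x80 and it returns true, and nothing before that last call could have known which way it would go.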


Reduce Your Dependencies

2019-11-18

I recently wanted to add colored output to a terminal/command line program. I checked some other project that was outputting color and saw they were using a library called chalk.

All else being equal I prefer smaller libraries to larger ones and I prefer to glue libraries together rather than take a library that tries to combine them for me. So, looking around I found chalk, colors, and ansi-colors. All popular libraries to provide colors in the terminal.

chalk is by far the largest with 5 dependencies totaling 3600 lines of code.

Things it combines

Next up is colors. It's about 1500 lines of code.

Like chalk it also spies on your command line arguments.

Next up is ansi-colors. It's about 900 lines of code. It claims to be a clone of colors without the excess parts. No auto detecting support. No spying on your command line. It does include the theme function if only to try to match colors' API.

Why all these hacks and integrations?

Themes

Starting with themes. chalk gets this one correct. They don't do anything. They just show you that it's trivial to do it yourself.

const theme = {
  cool: chalk.green,
  cold: chalk.blue,
  hot: chalk.red,
};

console.log(theme.hot('on fire'));

Why add a function setTheme just to do that? What happens if I go

colors.theme({
  red: 'green',
  green: 'red',
});

Yes you'd never do that but an API shouldn't be designed to fail. What was the point of cluttering this code with this feature when it's so trivial to do yourself?

Color Names

It would arguably be better to just have them as separate libraries. Let's assume the color libraries have a function rgb that takes an array of 3 values. Then you can do this:

const pencil = require('pencil');
const webColors = require('color-name');

pencil.rgb(webColors.burlywood)('some string');

vs

const chalk = require('chalk');

chalk.keyword('burlywood')('some-string');

In exchange for breaking the dependency you gain the ability to take the newest color set anytime color-name is updated rather than have to wait for chalk to update its deps. You also don't have 150 lines of unused JavaScript in your code if you're not using the feature which you weren't.

Color Conversion

As above the same is true of color conversions

const pencil = require('pencil');
// convert HSL to the RGB array the (hypothetical) pencil.rgb expects
const hslToRgb = require('color-convert').hsl.rgb;

pencil.rgb(hslToRgb(30, 100, 50))('some-string');

vs

const chalk = require('chalk');

chalk.hsl(30, 100, 50)('some-string');

Breaking the dependency 1500 lines are removed from the library that you probably weren't using anyway. You can update the conversion library if there are bugs or new features you want. You can also use other conversions and they won't have a different coding style.

Command Line hacks

As mentioned above chalk looks at your command line behind the scenes. I don't know how to even describe how horrible that is.

A library peeking at your command line behind the scenes seems like a really bad idea. To do this, not only is it looking at your command line, it's including another library to parse your command line. It has no idea how your command line works. Maybe you're shelling to another program and you have a -- to separate arguments to your program from arguments meant for the program you spawn, like Electron and npm. How would chalk know this? To fix this you have to hack around chalk using environment variables. But of course if the program you're shelling to also uses chalk it will inherit the environment variables, requiring yet more workarounds. It's just simply a bad idea.

Like the other examples, if your program takes command line arguments it's literally going to be 2 lines to do this yourself. One line to add --color to your list of arguments and one line to use it to configure the color library. Bonus, your command line argument is now documented for your users instead of being some hidden secret.
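Here's a rough sketch of what those 2 lines could look like. The pencil object below is a hypothetical stand-in for a color library with an enabled flag, and parseArgs is a made-up helper, so treat the names as assumptions rather than any real library's API.

```javascript
// Hypothetical stand-in for a color library that exposes an `enabled` flag.
const pencil = {
  enabled: false,
  red(s) { return this.enabled ? `\x1b[31m${s}\x1b[39m` : s; },
};

// Made-up helper: it only understands --color and --no-color.
// Everything else is passed through untouched in `rest`.
function parseArgs(argv) {
  const options = { color: false, rest: [] };
  for (const arg of argv) {
    if (arg === '--color') {
      options.color = true;        // line 1: accept the flag
    } else if (arg === '--no-color') {
      options.color = false;
    } else {
      options.rest.push(arg);
    }
  }
  return options;
}

const options = parseArgs(process.argv.slice(2));
pencil.enabled = options.color;    // line 2: configure the color library

console.log(pencil.red('on fire'));
```

The program decides what --color means and the library just does what it's told. Nothing is read from the command line behind your back.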

Detecting a Color Terminal

This is another one where the added dependency only detracts, not adds.

We could just do this:

const colorSupport = require('color-support');
const pencil = require('pencil');

pencil.enabled = colorSupport.hasBasic;

Was that so hard? Instead chalk tries to guess on its own. There are plenty of situations where it will guess wrong, which is why making the user add 2 lines of code is arguably a better design. Only they know when it's appropriate to auto detect.

Issues with Dependencies

There are more issues with dependencies than just aesthetics and bloat though.

Dependencies = Less Flexible

The library has chosen specific solutions. If you need different solutions you now have to work around the hard coded ones.

Dependencies = More Risk

Every dependency adds risks.

Dependencies = More Work for You

Every dependency a library uses is one more you have to deal with. Library A gets discontinued. Library B has a security bug. Library C has a data leak. Library D doesn't run in the newest version of node, etc...

If the library you were using didn't depend on A, B, C, and D all of those issues disappear. Less work for you. Less things to monitor. Less notifications of issues.

Lower your Dependencies

I picked on chalk and colors here because they're perfect examples of poor tradeoffs. Their dependencies take at most 2 lines of code to provide the same functionality without the dependencies, so including them did nothing but add all the issues and risks listed above.

It made more work for every user of chalk since they have to deal with the issues above. It even made more work for the developers of chalk who have to keep the dependencies up to date.

Just like they have a small blurb in their readme on how to implement themes they could have just as easily shown how to do all the other things without the dependencies using just 2 lines of code!

I'm not saying you should never have dependencies. The point is you should evaluate if they are really needed. In the case of chalk it's abundantly clear they were not. If you're adding a library to npm please reduce your dependencies. If it only takes 1 to 3 lines to reproduce the feature without the dependency then just document what to do instead of adding a dep. Your library will be more flexible. You'll expose your users to less risks. You'll make less work for yourself because you won't have to keep updating your deps. You'll make less work for your users because they won't have to keep updating your library just to get new deps.

Less dependencies = Everyone wins!


What to do about dependencies

2019-10-25

More rants on the dependencies issue

So today I needed to copy a file in a node based JavaScript build step.

Background: For those that don't know, node has a package manager called npm (Node Package Manager). Packages have a package.json file that defines tons of things, including a "scripts" section whose entries are effectively just tiny command line strings associated with a keyword.

Examples

"scripts": {
   "build": "make -f makefile",
   "test": "runtest-harness"
}

So you can now type npm run build to run the build script and it will run just as if you had typed make -f makefile.

Other than organization, the biggest plus is that if you have any development dependencies npm will look in those locally installed dependencies to run the commands. This means all your tools can be local to your project. If this project needs lint 1.6 and some other project needs lint 2.9, no worries. Just add the correct version of lint to your development dependencies and npm will run it for you.

But, the issue comes up, I wanted to copy a file. I could use a bigger build system but for small things you can imagine just wanting to use cp as in

"scripts": {
   "build": "make -f makefile && cp a.out dist/MyApp",
   ...

The problem is cp is mac/linux only. If you care about Windows devs being able to build on Windows then you can't use cp. The solution is to add a node based copy command to your development dependencies and then you can use it cross platform.

So, I go looking for copy commands. One of the most popular is cpy-cli. Here's its dependency tree:

└─┬ cpy-cli@2.0.0
  ├─┬ cpy@7.3.0
  │ ├── arrify@1.0.1
  │ ├─┬ cp-file@6.2.0
  │ │ ├── graceful-fs@4.2.3
  │ │ ├─┬ make-dir@2.1.0
  │ │ │ ├── pify@4.0.1 deduped
  │ │ │ └── semver@5.7.1 deduped
  │ │ ├── nested-error-stacks@2.1.0 deduped
  │ │ ├── pify@4.0.1
  │ │ └── safe-buffer@5.2.0
  │ ├─┬ globby@9.2.0
  │ │ ├─┬ @types/glob@7.1.1
  │ │ │ ├── @types/events@3.0.0
  │ │ │ ├── @types/minimatch@3.0.3
  │ │ │ └── @types/node@12.11.6
  │ │ ├─┬ array-union@1.0.2
  │ │ │ └── array-uniq@1.0.3
  │ │ ├─┬ dir-glob@2.2.2
  │ │ │ └─┬ path-type@3.0.0
  │ │ │   └── pify@3.0.0
  │ │ ├─┬ fast-glob@2.2.7
  │ │ │ ├─┬ @mrmlnc/readdir-enhanced@2.2.1
  │ │ │ │ ├── call-me-maybe@1.0.1
  │ │ │ │ └── glob-to-regexp@0.3.0
  │ │ │ ├── @nodelib/fs.stat@1.1.3
  │ │ │ ├─┬ glob-parent@3.1.0
  │ │ │ │ ├─┬ is-glob@3.1.0
  │ │ │ │ │ └── is-extglob@2.1.1 deduped
  │ │ │ │ └── path-dirname@1.0.2
  │ │ │ ├─┬ is-glob@4.0.1
  │ │ │ │ └── is-extglob@2.1.1
  │ │ │ ├── merge2@1.3.0
  │ │ │ └─┬ micromatch@3.1.10
  │ │ │   ├── arr-diff@4.0.0
  │ │ │   ├── array-unique@0.3.2
  │ │ │   ├─┬ braces@2.3.2
  │ │ │   │ ├── arr-flatten@1.1.0
  │ │ │   │ ├── array-unique@0.3.2 deduped
  │ │ │   │ ├─┬ extend-shallow@2.0.1
  │ │ │   │ │ └── is-extendable@0.1.1
  │ │ │   │ ├─┬ fill-range@4.0.0
  │ │ │   │ │ ├─┬ extend-shallow@2.0.1
  │ │ │   │ │ │ └── is-extendable@0.1.1 deduped
  │ │ │   │ │ ├─┬ is-number@3.0.0
  │ │ │   │ │ │ └─┬ kind-of@3.2.2
  │ │ │   │ │ │   └── is-buffer@1.1.6
  │ │ │   │ │ ├── repeat-string@1.6.1
  │ │ │   │ │ └─┬ to-regex-range@2.1.1
  │ │ │   │ │   ├── is-number@3.0.0 deduped
  │ │ │   │ │   └── repeat-string@1.6.1 deduped
  │ │ │   │ ├── isobject@3.0.1
  │ │ │   │ ├── repeat-element@1.1.3
  │ │ │   │ ├── snapdragon@0.8.2 deduped
  │ │ │   │ ├─┬ snapdragon-node@2.1.1
  │ │ │   │ │ ├─┬ define-property@1.0.0
  │ │ │   │ │ │ └─┬ is-descriptor@1.0.2
  │ │ │   │ │ │   ├─┬ is-accessor-descriptor@1.0.0
  │ │ │   │ │ │   │ └── kind-of@6.0.2 deduped
  │ │ │   │ │ │   ├─┬ is-data-descriptor@1.0.0
  │ │ │   │ │ │   │ └── kind-of@6.0.2 deduped
  │ │ │   │ │ │   └── kind-of@6.0.2 deduped
  │ │ │   │ │ ├── isobject@3.0.1 deduped
  │ │ │   │ │ └─┬ snapdragon-util@3.0.1
  │ │ │   │ │   └─┬ kind-of@3.2.2
  │ │ │   │ │     └── is-buffer@1.1.6 deduped
  │ │ │   │ ├─┬ split-string@3.1.0
  │ │ │   │ │ └── extend-shallow@3.0.2 deduped
  │ │ │   │ └── to-regex@3.0.2 deduped
  │ │ │   ├─┬ define-property@2.0.2
  │ │ │   │ ├─┬ is-descriptor@1.0.2
  │ │ │   │ │ ├─┬ is-accessor-descriptor@1.0.0
  │ │ │   │ │ │ └── kind-of@6.0.2 deduped
  │ │ │   │ │ ├─┬ is-data-descriptor@1.0.0
  │ │ │   │ │ │ └── kind-of@6.0.2 deduped
  │ │ │   │ │ └── kind-of@6.0.2 deduped
  │ │ │   │ └── isobject@3.0.1 deduped
  │ │ │   ├─┬ extend-shallow@3.0.2
  │ │ │   │ ├── assign-symbols@1.0.0
  │ │ │   │ └─┬ is-extendable@1.0.1
  │ │ │   │   └─┬ is-plain-object@2.0.4
  │ │ │   │     └── isobject@3.0.1 deduped
  │ │ │   ├─┬ extglob@2.0.4
  │ │ │   │ ├── array-unique@0.3.2 deduped
  │ │ │   │ ├─┬ define-property@1.0.0
  │ │ │   │ │ └─┬ is-descriptor@1.0.2
  │ │ │   │ │   ├─┬ is-accessor-descriptor@1.0.0
  │ │ │   │ │   │ └── kind-of@6.0.2 deduped
  │ │ │   │ │   ├─┬ is-data-descriptor@1.0.0
  │ │ │   │ │   │ └── kind-of@6.0.2 deduped
  │ │ │   │ │   └── kind-of@6.0.2 deduped
  │ │ │   │ ├─┬ expand-brackets@2.1.4
  │ │ │   │ │ ├── debug@2.6.9 deduped
  │ │ │   │ │ ├─┬ define-property@0.2.5
  │ │ │   │ │ │ └── is-descriptor@0.1.6 deduped
  │ │ │   │ │ ├─┬ extend-shallow@2.0.1
  │ │ │   │ │ │ └── is-extendable@0.1.1 deduped
  │ │ │   │ │ ├── posix-character-classes@0.1.1
  │ │ │   │ │ ├── regex-not@1.0.2 deduped
  │ │ │   │ │ ├── snapdragon@0.8.2 deduped
  │ │ │   │ │ └── to-regex@3.0.2 deduped
  │ │ │   │ ├─┬ extend-shallow@2.0.1
  │ │ │   │ │ └── is-extendable@0.1.1 deduped
  │ │ │   │ ├── fragment-cache@0.2.1 deduped
  │ │ │   │ ├── regex-not@1.0.2 deduped
  │ │ │   │ ├── snapdragon@0.8.2 deduped
  │ │ │   │ └── to-regex@3.0.2 deduped
  │ │ │   ├─┬ fragment-cache@0.2.1
  │ │ │   │ └── map-cache@0.2.2
  │ │ │   ├── kind-of@6.0.2
  │ │ │   ├─┬ nanomatch@1.2.13
  │ │ │   │ ├── arr-diff@4.0.0 deduped
  │ │ │   │ ├── array-unique@0.3.2 deduped
  │ │ │   │ ├── define-property@2.0.2 deduped
  │ │ │   │ ├── extend-shallow@3.0.2 deduped
  │ │ │   │ ├── fragment-cache@0.2.1 deduped
  │ │ │   │ ├── is-windows@1.0.2
  │ │ │   │ ├── kind-of@6.0.2 deduped
  │ │ │   │ ├── object.pick@1.3.0 deduped
  │ │ │   │ ├── regex-not@1.0.2 deduped
  │ │ │   │ ├── snapdragon@0.8.2 deduped
  │ │ │   │ └── to-regex@3.0.2 deduped
  │ │ │   ├─┬ object.pick@1.3.0
  │ │ │   │ └── isobject@3.0.1 deduped
  │ │ │   ├─┬ regex-not@1.0.2
  │ │ │   │ ├── extend-shallow@3.0.2 deduped
  │ │ │   │ └─┬ safe-regex@1.1.0
  │ │ │   │   └── ret@0.1.15
  │ │ │   ├─┬ snapdragon@0.8.2
  │ │ │   │ ├─┬ base@0.11.2
  │ │ │   │ │ ├─┬ cache-base@1.0.1
  │ │ │   │ │ │ ├─┬ collection-visit@1.0.0
  │ │ │   │ │ │ │ ├─┬ map-visit@1.0.0
  │ │ │   │ │ │ │ │ └── object-visit@1.0.1 deduped
  │ │ │   │ │ │ │ └─┬ object-visit@1.0.1
  │ │ │   │ │ │ │   └── isobject@3.0.1 deduped
  │ │ │   │ │ │ ├── component-emitter@1.3.0 deduped
  │ │ │   │ │ │ ├── get-value@2.0.6
  │ │ │   │ │ │ ├─┬ has-value@1.0.0
  │ │ │   │ │ │ │ ├── get-value@2.0.6 deduped
  │ │ │   │ │ │ │ ├─┬ has-values@1.0.0
  │ │ │   │ │ │ │ │ ├── is-number@3.0.0 deduped
  │ │ │   │ │ │ │ │ └─┬ kind-of@4.0.0
  │ │ │   │ │ │ │ │   └── is-buffer@1.1.6 deduped
  │ │ │   │ │ │ │ └── isobject@3.0.1 deduped
  │ │ │   │ │ │ ├── isobject@3.0.1 deduped
  │ │ │   │ │ │ ├─┬ set-value@2.0.1
  │ │ │   │ │ │ │ ├─┬ extend-shallow@2.0.1
  │ │ │   │ │ │ │ │ └── is-extendable@0.1.1 deduped
  │ │ │   │ │ │ │ ├── is-extendable@0.1.1 deduped
  │ │ │   │ │ │ │ ├── is-plain-object@2.0.4 deduped
  │ │ │   │ │ │ │ └── split-string@3.1.0 deduped
  │ │ │   │ │ │ ├─┬ to-object-path@0.3.0
  │ │ │   │ │ │ │ └─┬ kind-of@3.2.2
  │ │ │   │ │ │ │   └── is-buffer@1.1.6 deduped
  │ │ │   │ │ │ ├─┬ union-value@1.0.1
  │ │ │   │ │ │ │ ├── arr-union@3.1.0 deduped
  │ │ │   │ │ │ │ ├── get-value@2.0.6 deduped
  │ │ │   │ │ │ │ ├── is-extendable@0.1.1 deduped
  │ │ │   │ │ │ │ └── set-value@2.0.1 deduped
  │ │ │   │ │ │ └─┬ unset-value@1.0.0
  │ │ │   │ │ │   ├─┬ has-value@0.3.1
  │ │ │   │ │ │   │ ├── get-value@2.0.6 deduped
  │ │ │   │ │ │   │ ├── has-values@0.1.4
  │ │ │   │ │ │   │ └─┬ isobject@2.1.0
  │ │ │   │ │ │   │   └── isarray@1.0.0
  │ │ │   │ │ │   └── isobject@3.0.1 deduped
  │ │ │   │ │ ├─┬ class-utils@0.3.6
  │ │ │   │ │ │ ├── arr-union@3.1.0
  │ │ │   │ │ │ ├─┬ define-property@0.2.5
  │ │ │   │ │ │ │ └── is-descriptor@0.1.6 deduped
  │ │ │   │ │ │ ├── isobject@3.0.1 deduped
  │ │ │   │ │ │ └─┬ static-extend@0.1.2
  │ │ │   │ │ │   ├─┬ define-property@0.2.5
  │ │ │   │ │ │   │ └── is-descriptor@0.1.6 deduped
  │ │ │   │ │ │   └─┬ object-copy@0.1.0
  │ │ │   │ │ │     ├── copy-descriptor@0.1.1
  │ │ │   │ │ │     ├─┬ define-property@0.2.5
  │ │ │   │ │ │     │ └── is-descriptor@0.1.6 deduped
  │ │ │   │ │ │     └─┬ kind-of@3.2.2
  │ │ │   │ │ │       └── is-buffer@1.1.6 deduped
  │ │ │   │ │ ├── component-emitter@1.3.0
  │ │ │   │ │ ├─┬ define-property@1.0.0
  │ │ │   │ │ │ └─┬ is-descriptor@1.0.2
  │ │ │   │ │ │   ├─┬ is-accessor-descriptor@1.0.0
  │ │ │   │ │ │   │ └── kind-of@6.0.2 deduped
  │ │ │   │ │ │   ├─┬ is-data-descriptor@1.0.0
  │ │ │   │ │ │   │ └── kind-of@6.0.2 deduped
  │ │ │   │ │ │   └── kind-of@6.0.2 deduped
  │ │ │   │ │ ├── isobject@3.0.1 deduped
  │ │ │   │ │ ├─┬ mixin-deep@1.3.2
  │ │ │   │ │ │ ├── for-in@1.0.2
  │ │ │   │ │ │ └─┬ is-extendable@1.0.1
  │ │ │   │ │ │   └── is-plain-object@2.0.4 deduped
  │ │ │   │ │ └── pascalcase@0.1.1
  │ │ │   │ ├─┬ debug@2.6.9
  │ │ │   │ │ └── ms@2.0.0
  │ │ │   │ ├─┬ define-property@0.2.5
  │ │ │   │ │ └─┬ is-descriptor@0.1.6
  │ │ │   │ │   ├─┬ is-accessor-descriptor@0.1.6
  │ │ │   │ │   │ └─┬ kind-of@3.2.2
  │ │ │   │ │   │   └── is-buffer@1.1.6 deduped
  │ │ │   │ │   ├─┬ is-data-descriptor@0.1.4
  │ │ │   │ │   │ └─┬ kind-of@3.2.2
  │ │ │   │ │   │   └── is-buffer@1.1.6 deduped
  │ │ │   │ │   └── kind-of@5.1.0
  │ │ │   │ ├─┬ extend-shallow@2.0.1
  │ │ │   │ │ └── is-extendable@0.1.1 deduped
  │ │ │   │ ├── map-cache@0.2.2 deduped
  │ │ │   │ ├── source-map@0.5.7
  │ │ │   │ ├─┬ source-map-resolve@0.5.2
  │ │ │   │ │ ├── atob@2.1.2
  │ │ │   │ │ ├── decode-uri-component@0.2.0
  │ │ │   │ │ ├── resolve-url@0.2.1
  │ │ │   │ │ ├── source-map-url@0.4.0
  │ │ │   │ │ └── urix@0.1.0
  │ │ │   │ └── use@3.1.1
  │ │ │   └─┬ to-regex@3.0.2
  │ │ │     ├── define-property@2.0.2 deduped
  │ │ │     ├── extend-shallow@3.0.2 deduped
  │ │ │     ├── regex-not@1.0.2 deduped
  │ │ │     └── safe-regex@1.1.0 deduped
  │ │ ├─┬ glob@7.1.5
  │ │ │ ├── fs.realpath@1.0.0
  │ │ │ ├─┬ inflight@1.0.6
  │ │ │ │ ├── once@1.4.0 deduped
  │ │ │ │ └── wrappy@1.0.2
  │ │ │ ├── inherits@2.0.4
  │ │ │ ├─┬ minimatch@3.0.4
  │ │ │ │ └─┬ brace-expansion@1.1.11
  │ │ │ │   ├── balanced-match@1.0.0
  │ │ │ │   └── concat-map@0.0.1
  │ │ │ ├─┬ once@1.4.0
  │ │ │ │ └── wrappy@1.0.2 deduped
  │ │ │ └── path-is-absolute@1.0.1
  │ │ ├── ignore@4.0.6
  │ │ ├── pify@4.0.1 deduped
  │ │ └── slash@2.0.0
  │ └── nested-error-stacks@2.1.0
  └─┬ meow@5.0.0
    ├─┬ camelcase-keys@4.2.0
    │ ├── camelcase@4.1.0
    │ ├── map-obj@2.0.0
    │ └── quick-lru@1.1.0
    ├─┬ decamelize-keys@1.1.0
    │ ├── decamelize@1.2.0
    │ └── map-obj@1.0.1
    ├─┬ loud-rejection@1.6.0
    │ ├─┬ currently-unhandled@0.4.1
    │ │ └── array-find-index@1.0.2
    │ └── signal-exit@3.0.2
    ├─┬ minimist-options@3.0.2
    │ ├── arrify@1.0.1 deduped
    │ └── is-plain-obj@1.1.0
    ├─┬ normalize-package-data@2.5.0
    │ ├── hosted-git-info@2.8.5
    │ ├─┬ resolve@1.12.0
    │ │ └── path-parse@1.0.6
    │ ├── semver@5.7.1
    │ └─┬ validate-npm-package-license@3.0.4
    │   ├─┬ spdx-correct@3.1.0
    │   │ ├── spdx-expression-parse@3.0.0 deduped
    │   │ └── spdx-license-ids@3.0.5
    │   └─┬ spdx-expression-parse@3.0.0
    │     ├── spdx-exceptions@2.2.0
    │     └── spdx-license-ids@3.0.5 deduped
    ├─┬ read-pkg-up@3.0.0
    │ ├─┬ find-up@2.1.0
    │ │ └─┬ locate-path@2.0.0
    │ │   ├─┬ p-locate@2.0.0
    │ │   │ └─┬ p-limit@1.3.0
    │ │   │   └── p-try@1.0.0
    │ │   └── path-exists@3.0.0
    │ └─┬ read-pkg@3.0.0
    │   ├─┬ load-json-file@4.0.0
    │   │ ├── graceful-fs@4.2.3 deduped
    │   │ ├─┬ parse-json@4.0.0
    │   │ │ ├─┬ error-ex@1.3.2
    │   │ │ │ └── is-arrayish@0.2.1
    │   │ │ └── json-parse-better-errors@1.0.2
    │   │ ├── pify@3.0.0
    │   │ └── strip-bom@3.0.0
    │   ├── normalize-package-data@2.5.0 deduped
    │   └── path-type@3.0.0 deduped
    ├─┬ redent@2.0.0
    │ ├── indent-string@3.2.0
    │ └── strip-indent@2.0.0
    ├── trim-newlines@2.0.0
    └─┬ yargs-parser@10.1.0
      └── camelcase@4.1.0 deduped

Yea, what the actual Effing F!?

197 dependencies, 1170 files, 47000 lines of JavaScript to copy files.

I ended up writing my own. Here's the entire program:

const fs = require('fs');
const src = process.argv[2];
const dst = process.argv[3];
fs.copyFileSync(src, dst);

And I added it to my build like this

"scripts": {
   "build": "make -f makefile && node copy.js a.out dist/MyApp",
   ...

So, my first reaction was, yea, something is massively over engineered. Or maybe that's under engineered if by under engineered it means "made without thinking".

You might think so what, people have large hard drives, fast internet, lots of memory. Who cares about dependencies? Well, the more dependencies you have the more you get messages like this

found 35 vulnerabilities (1 low, 2 moderate, 31 high, 1 critical) in 1668 scanned packages

You get more and more and more maintenance with more dependencies.

Not only that, you get dependent, not just on the software but on the people maintaining that software. Above, 197 dependencies also means trusting that none of them are doing anything bad. As far as we know one of those dependencies could easily have a time bomb waiting until some day in the future to pwn your machine or server.

On the other hand my copy copies a single file. cpy-cli copies similar to cp. It can copy multiple files and whole trees.

I started wondering what it would take to add the minimal features to reproduce a functional cp clone. Note: not a full clone, a functional clone. I'm sure cp has a million features but in my entire 40yr career I've only used about 2 of those features. (1) copying using a wildcard as in cp *.txt dst, which honestly is handled by the shell, not cp. (2) copying recursively with cp -R src dst.

The first thing I did was look at a command line argument library. I've used one called optionator in the past and it's fine. I check and it has several dependencies. 2 that stick out are:

  1. a wordwrap library.

    This is used to make your command's help fit the size of the terminal you're in. Definitely a useful feature. I have terminals of all different sizes. I default to having 4 open.

  2. a levenshtein distance library.

    This is used so that if you specify a switch that doesn't exist it can try to suggest the correct one. For example you might type:

       my-copy-clone --src=abc.txt --destinatoin=def.txt

    and it would say something like

       no such switch: 'destinatoin' did you mean 'destination'?

    Yea, that's kind of useful too.
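For a sense of scale, here's a minimal sketch of that "did you mean" feature using the textbook dynamic programming version of Levenshtein distance. This is not how optionator or its dependency does it. The suggest helper, the list of known switches, and the cutoff of 3 edits are all assumptions made up for illustration.

```javascript
// Textbook dynamic-programming Levenshtein distance between two strings.
function levenshtein(a, b) {
  // prev[j] = distance between the first i-1 chars of a and the first j chars of b
  let prev = Array.from({length: b.length + 1}, (_, j) => j);
  for (let i = 1; i <= a.length; ++i) {
    const curr = [i];
    for (let j = 1; j <= b.length; ++j) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,         // deletion
        curr[j - 1] + 1,     // insertion
        prev[j - 1] + cost   // substitution
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: returns the closest known switch, or undefined
// if nothing is within maxDistance edits of what the user typed.
function suggest(unknown, known, maxDistance = 3) {
  let best;
  let bestDist = maxDistance + 1;
  for (const candidate of known) {
    const d = levenshtein(unknown, candidate);
    if (d < bestDist) {
      best = candidate;
      bestDist = d;
    }
  }
  return best;
}

// Made-up list of switches for the example.
const known = ['src', 'destination', 'verbose'];
const guess = suggest('destinatoin', known);
if (guess) {
  console.log(`no such switch: 'destinatoin' did you mean '${guess}'?`);
}
```

It's a few dozen lines, so whether it's worth pulling in a library for it is the same tradeoff as everything else here.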

Okay so my 4 line copy.js just got 3500 lines of libraries added. Or maybe I should look into another library that uses less deps while getting "woke" about dependencies.

Meh, I decide to parse my own arguments rather than take 3500 lines of code and 7 dependencies. Here's the code

#!/usr/bin/env node

'use strict';

const fs = require('fs');
const ldcp = require('../src/ldcp');

const args = process.argv.slice(2);

const options = {
  recurse: false,
  dryRun: false,
  verbose: false,
};

while (args.length && args[0].startsWith('-')) {
  const opt = args.shift();
  switch (opt) {
    case '-v':
    case '--verbose':
       options.verbose = true;
       break;
    case '--dry-run':
       options.dryRun = true;
       options.verbose = true;
       break;
    case '-R':
       options.recurse = true;
       break;
    default:
       console.error('illegal option:', opt);
       printUsage();
  }
}

function printUsage() {
  console.log('usage: ldcp [-R] src_file dst_file\n       ldcp [-R] src_file ... dst_dir');
  process.exit(1);
}


let dst = args.pop();
if (args.length < 1) {
  printUsage();
}

Now that the args are parsed we need a function to copy the files

const path = require('path');
const fs = require('fs');

const defaultAPI = {
  copyFileSync(...args) { return fs.copyFileSync(...args) },
  mkdirSync(...args) { return fs.mkdirSync(...args); },
  statSync(...args) { return fs.statSync(...args); },
  readdirSync(...args) { return fs.readdirSync(...args); },
  log() {},
};

function ldcp(_srcs, dst, options, api = defaultAPI) {
  const {recurse} = options;

  // check if dst is or needs to be a directory
  const dstStat = safeStat(dst);
  let isDstDirectory = false;
  let needMakeDir = false;
  if (dstStat) {
    isDstDirectory = dstStat.isDirectory();
  } else {
    isDstDirectory = recurse;
    needMakeDir = recurse;
  }

  if (!recurse && _srcs.length > 1 && !isDstDirectory) {
    throw new Error('can not copy multiple files to same dst file');
  }

  const srcs = [];

  // handle the case where src ends with / like cp
  for (const src of _srcs) {
    if (recurse) {
      const srcStat = safeStat(src);
      if ((needMakeDir && srcStat && srcStat.isDirectory()) ||
          (src.endsWith('/') || src.endsWith('\\'))) {
        srcs.push(...api.readdirSync(src).map(f => path.join(src, f)));
        continue;
      }
    }
    srcs.push(src);
  }

  const srcDsts = [{srcs, dst, isDstDirectory, needMakeDir}];

  while (srcDsts.length) {
    const {srcs, dst, isDstDirectory, needMakeDir} = srcDsts.shift();

    if (needMakeDir) {
      api.log('mkdir', dst);
      api.mkdirSync(dst);
    }

    for (const src of srcs) {
      const dstFilename = isDstDirectory ? path.join(dst, path.basename(src)) : dst;
      if (recurse) {
        const srcStat = api.statSync(src);
        if (srcStat.isDirectory()) {
          srcDsts.push({
              srcs: api.readdirSync(src).map(f => path.join(src, f)),
              dst: path.join(dst, path.basename(src)),
              isDstDirectory: true,
              needMakeDir: true,
          });
          continue;
        }
      }
      api.log('copy', src, dstFilename);
      api.copyFileSync(src, dstFilename);
    }
  }

  function safeStat(filename) {
    try {
      return api.statSync(filename.replace(/(\\|\/)$/, ''));
    } catch (e) {
      //
    }
  }
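}

// export the function so the command line script can require('../src/ldcp')
module.exports = ldcp;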

I made it so you pass an optional API of all the external functions it calls. That way you can pass in for example functions that do nothing if you want to test it. Or you can pass in graceful-fs if that's your jam but in the interest of NOT adding dependencies if you want that that's on you. Simple!
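As a sketch of what testing with a passed-in API might look like, here's a fake that records calls instead of touching the disk. To keep the example self-contained it uses a tiny stand-in function, copyDirShallow, rather than the real ldcp. Both makeFakeAPI and copyDirShallow are hypothetical names invented for this example; only the method names on the api object match the ones above.

```javascript
const path = require('path');

// Hypothetical fake: same method names as the api object above, but backed
// by an in-memory map of path -> 'file' or an array of child names.
function makeFakeAPI(tree) {
  const calls = [];
  return {
    calls,
    copyFileSync(src, dst) { calls.push(['copy', src, dst]); },
    mkdirSync(dir) { calls.push(['mkdir', dir]); },
    statSync(p) {
      if (!(p in tree)) {
        throw new Error(`ENOENT: ${p}`);
      }
      return { isDirectory: () => Array.isArray(tree[p]) };
    },
    readdirSync(p) { return tree[p]; },
    log() {},
  };
}

// Stand-in for the function under test (NOT the real ldcp): copies every
// file directly inside srcDir into dstDir using only the injected api.
// path.posix keeps the separators the same on every platform.
function copyDirShallow(srcDir, dstDir, api) {
  api.mkdirSync(dstDir);
  for (const name of api.readdirSync(srcDir)) {
    const src = path.posix.join(srcDir, name);
    if (!api.statSync(src).isDirectory()) {
      api.copyFileSync(src, path.posix.join(dstDir, name));
    }
  }
}

const api = makeFakeAPI({
  'src': ['a.txt', 'sub'],
  'src/a.txt': 'file',
  'src/sub': [],
});
copyDirShallow('src', 'out', api);
console.log(api.calls);
```

Because nothing here touches the real file system, a test can just compare api.calls against the operations it expected.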

All that's left is using it after parsing the args

const log = options.verbose ? console.log.bind(console) : () => {};
const api = options.dryRun ? {
  copyFileSync(src) { fs.statSync(src) },
  mkdirSync() { },
  statSync(...args) { return fs.statSync(...args); },
  readdirSync(...args) { return fs.readdirSync(...args); },
  log,
} : {
  copyFileSync(...args) { return fs.copyFileSync(...args) },
  mkdirSync(...args) { return fs.mkdirSync(...args); },
  statSync(...args) { return fs.statSync(...args); },
  readdirSync(...args) { return fs.readdirSync(...args); },
  log,
};

ldcp(args, dst, options, api);

Total lines: 176 and 0 dependencies.

It's here if you want it.


10 Things Apple Could do to Increase Privacy.

2019-07-22

Apple under Tim Cook is staking out the claim that they are "the Privacy company".

Apple products are designed to protect your privacy.

At Apple, we believe privacy is a fundamental human right.

Here's 10 things they could do to actually honor that mission.

1. Disallow Apps from using the camera directly.

This one is problematic but ... the majority of apps that ask to use your camera do not actually need access to your camera. Examples are the Facebook App, The Messenger App, the Twitter App. Even the Instagram App. Instead Apple could change their APIs such that the app asks for a camera picture and the OS takes the picture.

This removes the need for those apps to have access to the camera at all. The only thing the app would get is the picture you took using the built in camera functionality controlled by the OS itself. If you don't take a picture and pick "Use Picture" then the app never sees anything.

As it is now you really have no idea what the app is doing. When you are in the Facebook app, once you've given the app permission to use the camera then as far as you know the app is streaming video, or pictures to Facebook constantly. You have absolutely no idea.

By changing the API so that the app is required to ask the OS for a photo that problem would be solved.

The problem with this solution is it doesn't cover streaming video since in that case the app needs the constant video. It also doesn't cover unique apps that do special things with the camera.

One solution to the unique camera feature issue would be app store rules. Basically only "camera" apps would be allowed to use the camera directly. SNS apps and other apps that just need a photo would be rejected if they asked for camera permission instead of asking the OS for a photo.

Another solution might be that the OS always asks the user for permission to use the camera (or at least provides the option). In other words if you are in some app like the Instagram app and you click the "take a photo" image the OS asks you "Allow App To Use The Camera?" each and every time. As it is now it only asks once. For those people that are privacy conscious, being able to give the app permission each and every time would prevent spying.

2. Disallow Apps from using the Mic directly

See the previous section, just replace every instance of "camera" with "mic".

3. Disallow access to all Photos

This is similar to the two above but, as it is now apps like the Facebook App, Twitter, etc will ask for permission to access your photos. They do this so they can provide an interface to let you choose photos to post on facebook or tweet on twitter.

The problem is the moment you give them permission they can immediately look at ALL of your photos. All of them!

It would be better if Apple changed the API so the app asks the OS to ask you to choose 1 or more photos. The OS would then present an interface to choose 1 or more photos at which point only those photos you chose are given to the app.

That way apps could not read all of your photos.

Note that I get that some apps also want permission to read all your photos so they can upload all of them automatically as you take them. That's fine, it should just be a separate permission. Apple should enforce that features that let you choose photos to upload go through the OS's photo chooser, and that apps that want full permission to access all photos for things like backup must also function without that permission when selecting photos for other purposes.

4. Let GPS be one time only

There are 3 options for GPS currently

  1. Let the app use GPS always
  2. Let the app use GPS when active
  3. Disallow GPS

There needs to be a 4th

  4. Ask for permission each time

As it is, basically if you give an app permission to use GPS at all then every time you access that app it gets to know where you are.

It would be much more privacy oriented if you could choose to only give it GPS access for a moment, next 5 minutes, next 30 minutes, etc...

As it is now if you're privacy conscious you have to dig deep into the settings app for the privacy options. Give an app permission for GPS, then remember to dig through those options again to turn GPS permission back off a few minutes later.

That's not a very privacy oriented design.

5. Disallow apps from implementing an internal web browser.

Many apps show links to websites. For example Twitter or Facebook or the Google Maps app. When you click the links those apps open a web browser directly inside their app.

This means they can spy on everything you do in that web browser. That's not privacy oriented.

Apple should disallow having an internal web browser. They could do this by enforcing a policy that you can only make an app that can access all websites if that app is a web browser app. Otherwise you have to list the sites your app is allowed to access and that list has to be relatively small.

Many apps are actually just an app that goes directly to some company's website which is fine. The app can list company.com or *.company.com as the sites it accesses. Otherwise it's not allowed to access any other websites.
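As a rough sketch of what that check could look like, the function below matches a URL's host against a short declared list such as `company.com` and `*.company.com`. The names `isHostAllowed` and `declaredSites`, and the list format itself, are assumptions for illustration. This is not how iOS actually declares or enforces anything.

```javascript
// Hypothetical sketch: may the app load this URL itself, given the
// short list of sites it declared? Not a real iOS API.
function isHostAllowed(url, allowedPatterns) {
  const host = new URL(url).hostname.toLowerCase();
  return allowedPatterns.some((pattern) => {
    const p = pattern.toLowerCase();
    if (p.startsWith('*.')) {
      // "*.company.com" matches any subdomain, but not "company.com"
      // itself and not look-alikes such as "evilcompany.com".
      return host.endsWith(p.slice(1));
    }
    // A plain entry must match the host exactly.
    return host === p;
  });
}

const declaredSites = ['company.com', '*.company.com'];

console.log(isHostAllowed('https://company.com/login', declaredSites));    // true
console.log(isHostAllowed('https://api.company.com/v1', declaredSites));   // true
console.log(isHostAllowed('https://evilcompany.com/', declaredSites));     // false
console.log(isHostAllowed('https://example.org/article', declaredSites));  // false
```

Anything that fails the check would be handed to the user's browser instead of being loaded inside the app.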

This would force apps to launch the user's browser when they click a link, which would mean the apps could no longer spy on your browser activity. The most they could do is know the link you clicked. They couldn't know every link you click after that, nor could they log everything you enter on every website you visit while in their app as they can do now.

Note that this would also be better user experience IMO. Users are used to the features available in their browser. For example being able to search in a page. Being able to turn on reader mode. Being able to bookmark and have those bookmarks sync. Being able to use an ad blocker. Etc... As it is when an app uses an internal web browser all of these features are not available. It's inconsistent and inconvenient for the user. By forcing apps to launch the user's browser all of that is solved.

Note: Apple should also allow setting a default browser so that users can choose Firefox or Brave or Chrome or whatever browser they choose for the features they want. If I use Firefox on my Mac I want to be able to bookmark things on iOS and have those bookmarks synced to my Mac, but that becomes cumbersome if the OS keeps launching Safari instead of Firefox or whatever my browser of choice is.

6. Put a light on the camera/mic?

In Japan there is a law that phone cameras must make a shutter noise. I actually despise that law. I want to be able to take pictures of my delicious gourmet meal in a quiet fancy restaurant without alerting and annoying all the other guests that I'm doing so. Japan claims this is to prevent perverts from taking upskirt pictures, but perverts can just buy non-phone cameras, and they can use an app because apps are not bound by the same laws. In effect this law does absolutely nothing except make it annoying and embarrassing to take pictures in quiet places.

On the other hand, if there was a small green or orange light next to the camera that was physically connected to the camera's power, so that it came on whenever the camera is on, then I'd know when the camera was in use. That would at least be a privacy oriented feature, so unlike the law above it would have a point.

If they wanted to be cute they could use a multi-color LED where red = camera is on, green = mic is on, yellow = both are on.

Let me add, I wish Apple devices had a built in camera cover, or at least the Macs. I know you can buy a 3rd party one but adding a built in cover would show Apple is serious about privacy.

7. Disallow scanning WiFi / Bluetooth for most apps

AFAIK any app can scan WiFi and/or Bluetooth. Apps can use this info to know your location even if you have GPS off.

Basically there are databases of every WiFi's SSID (the name you pick to connect to a WiFi hotspot/router) and the databases also have recorded that WiFi's GPS so if they know which WiFis are near then they basically know where you are.

Here's a website where you can see what I'm talking about. Zoom in anywhere in the world and it will show the known WiFi hotspots / routers.

https://wigle.net/

Why do most apps need this ability? They don't. So why doesn't Apple disallow it for most apps?

There are exceptions. I have a WiFi scanner app and a WiFi signal strength app and even a Bluetooth scanner and testing app that are very useful, but Apple could easily have an App Store policy that only network utilities are allowed to use this powerful spying feature.

There is absolutely no reason the Twitter app or the Facebook app needs to be able to see WiFi SSIDs or local Bluetooth devices.

Apple could easily add a permission requirement to use these features and only allow select apps to have them. OR they could add it as yet another per app privacy setting.

8. Allow more browser engines

This one is probably the most controversial suggestion here. The reasoning though goes like this

Safari is not even remotely the most secure browser.

This is provable by looking through the National Vulnerability Database (NVD) run by the National Institute of Standards and Technology (NIST)

In it you can see that while all browsers have around the same number of vulnerabilities, the types of vulnerabilities are different. Some browsers are designed to be more secure and so are less likely to have vulnerabilities that compromise your device and therefore your privacy. To put it slightly more concretely, 2 browsers might both have 150 vulnerabilities a year but one might have 90% code execution vulnerabilities (your device and data are compromised) and the other might have 90% DOS vulnerabilities (your device slows down or freezes but no data is compromised). If you check the database you'll find it's true that some browsers have orders of magnitude more code execution vulnerabilities than others.

By allowing competing browser engines users would have the choice to run those empirically more secure browser engines.

As it is now Safari has zero competition on iOS. A developer can make a new browser but it's really just Safari with a skin. That means Apple has less competition and so there is less pressure to make Safari better.

Allowing competing browser engines would both be a win for privacy and encourage faster and more active development of Safari.

The number 1 objection I hear is that allowing other engines is a security issue but that is also provably false. See the NVD above. Other engines are more secure. By disallowing other engines you prevent users from protecting themselves from being hacked and therefore having their privacy invaded.

Another objection I hear is that JITing, turning JavaScript into machine code, is something only Apple should be able to do. That argument basically boils down to saying Apple's app sandbox is insecure and that all apps must be 100% perfect or else they can escape the sandbox. You can't have it both ways. Either Apple's app sandbox is insecure and therefore the whole product is insecure, OR Apple's app sandbox is secure and therefore allowing JITing doesn't affect that security. Now of course Apple's app sandbox could have bugs but those bugs can be exploited by any app. The solution is for Apple to be diligent and fix the bugs quickly. The solution is not to make up some bogus JIT restriction.

To make an analogy, if a product advertises itself as waterproof then it had better actually be waterproof. It can't come with some disclaimer that says "waterproof to 100 meters but don't actually put this product in water as it might break".

The JIT argument is basically the same. "Our app sandbox is secure but don't actually run any code". It's clear the JIT argument is bogus. It exists only to allow Apple a monopoly on browsers on iOS so they don't have to compete and so they can wield veto power over all browser standards. Since only they can make new browser features available to their 1.4 billion iOS devices, if they don't support a feature it might as well not exist. Since devs can't use the feature with those 1.4 billion devices they generally just avoid the feature altogether, even on non-iOS devices.

All that is the long way of saying users would be more secure and get better privacy if they could run more secure and more privacy oriented browsers.

9. Lower the price of Apple products or come out with cheaper alternatives

Apple fans won't like this reason. I don't consider myself an Apple fan and yet I own a Macbook Air, a Macbook Pro, a Mac Mini, an iPad Air 3rd Generation, an iPhone6+, an iPhoneX, and an Apple TV 4, and at one point I also owned a late 2018 iPad Pro and a 4th Gen Apple Watch, so clearly I also like Apple even if I don't consider myself a fanatic.

The thing is Apple is expensive. People will argue Apple's quality is high and worth the price and that might be true but it's kind of beside the point. You could make the argument a BMW or Mercedes Benz is a higher quality car than a Kia or a Hyundai, but it's not realistic to ask someone who only has the budget for a used Kia or Hyundai to buy a BMW or Mercedes.

Similarly if you have a family of 4 and you want to give everyone in the family their own laptop computer you can buy 4 Windows laptops for about the price of the cheapest Mac laptop. Sure those $200-$300 laptops are not nearly as nice as a Macbook Air but just like a Kia will still get you to your job, a $250 Windows laptop will still let you browse the internet, run Microsoft Word, Illustrator, Photoshop, listen to music, watch YouTube, edit a blog, read Reddit, learn to program, etc. It's unrealistic to ask a family of four to spend $4400 for 4 Mac laptops instead of $1200 for 4 Windows laptops.

Now you might be thinking so what... people who can afford it should be able to spend their money on whatever they want. That's no different than anything else. Rich people buy penthouses just off Central Park and poor people live in trailer parks. The difference though is for most expensive things there are functionally equivalent inexpensive alternatives. A Kia will get you to work just as well as a BMW. Cheap clothing from Old Navy or Uniqlo or H&M will clothe you just as well as clothing from Versace or Prada or Louis Vuitton or pick your favorite expensive brand. The food at Applebees will feed you just as well as the food from French Laundry. A $250 Vizio TV will let you watch TV just as functionally as a $4000 Sony.

But, if Apple really is the only privacy oriented option, if Android and Windows don't take your privacy seriously, then Apple being out of reach of so many people is ... well, I don't know what words to use. It basically says that people who can't afford Apple don't deserve privacy.

Of course that's not Apple's fault. Microsoft for Windows and Google for Android could step up and start making their OSes stop sucking for privacy.

My only point is if Apple is "the privacy company" then at the moment they are really "the privacy company for non-poor people" only and that they could be the privacy company for everyone if they offered some more affordable alternatives.

10. Stop asking for passwords to repair

If you take your Apple device in for repair they will ask you for your password or passcode. What the actual Effing Eff!??? Privacy? What? What's that? No, give us the password that unlocks all of your bank accounts, shopping accounts, bitcoin accounts, etc. Give us the password that lets us look at all your photos and videos. Give us the password that gives us access to the email on your device so that we can use that to open all other accounts by asking for password resets. Give us the password for the device that has all your two-factor codes and apps that confirm login on various services.

This is Apple's default stance. If you take a device in for service they will ask you for your password or passcode. That is not the kind of policy a privacy first company would have!

If you object they might tell you to change your password to something else and then change it back after you've gotten the repaired device back. That helps them not to know your normal password. It doesn't prevent all the stuff above.

If it's a Mac they'll give you the option to either turn on the guest account or add another account for them to login. Unfortunately that's really no better. If you're actually privacy oriented you'll have encrypted the hard drive. Giving them a password that unlocks the drive effectively gives them access to all your data whether or not a particular account has access to that data.

You can opt out of that too, in which case they'll basically throw up their hands and say "In that case we may not be able to confirm the repairs". Another option is you can format the drive before giving it to them. Is that really the only option a privacy oriented company should give you?

Now I get it, I'm sympathetic to the fact that it's harder for them if you don't give them the password. Still, for a Mac they can plug in an external drive and boot off that and at least confirm the machine itself is fine. For an iOS device, if they really are a "Privacy First" company then they need to find another way. They need to design a way to service the products that doesn't risk your privacy and risk exposing all your data.

Do I trust Apple as a company? Mostly. Do I trust every $15 an hour employee at the store like the one asking for the password? No! Do I even trust some repair technician making more money but who may be getting paid on the side to scoop up login credentials? No! Do I know they destroy the info when the repair is finished? Nope! They ask you to write it down. As far as I know I could go dig through the trash behind an Apple store and find tons of credentials. Also as far as I know it's all stored in their service database ready to be stolen or hacked.

A privacy first company would do something different. They might for example back up your entire hard drive or iDevice, then reformat it, work on it, then restore. They might put it all on a USB drive and hand the drive to you; you bring it back when they're done with any physical repairs and they restore it then and reformat the drive. If that's too slow then that's just incentive for them to make it faster. They might add some special service partition or service mode they can boot devices into.

The point is, a company that claims to take privacy seriously shouldn't be asking you to tell them the single most important password in your life. The password that unlocks all other passwords.


I'm not really hopeful Apple will make these changes but I'd argue if they don't make them then their statements of

Apple products are designed to protect your privacy.

At Apple, we believe privacy is a fundamental human right.

are really just marketing and not at all real. Let's hope they are real and Apple takes more steps to increase user privacy.
