Nume MacAroon Ⓥ @nm

1 post1 participant0 posts today

4d

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

We ported #Microvium to #CHERIoT several years ago, but never got around to writing up the details. For anyone interested in how CHERIoT can help you extend safe-language guarantees across an FFI boundary.

CHERIoT Platform · 5dCHERIoT and MicroviumWe’ve included a port of the Microvium embedded JavaScript runtime. We originally did this port even before we open sourced the CHERIoT project We haven’t talked about it much and that’s something of an omission, since it is quite a nice case study in supporting a managed language on a CHERI platform

Apr 1

Apr 1

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

One of the reasons I'm still using GitHub for a lot of stuff is the free CI, but I hadn't really realised how little that actually costs. For #CHERIoT #LLVM, we're using Cirrus-CI with a 'bring your own cloud subscription' thing. We set up ccache backed by a cloud storage thing, so incremental builds are fast. The bill for last month? £0.31.

We'll probably pay more as we hire more developers, but I doubt it will cost more than £10/month even with an active team and external contributors. Each CI run costs almost a rounding-error amount, and that's doing a clean (+ ccache) build of LLVM and running the test suite. We're using Google's Arm instances, which have amazingly good price:performance (much better than the x86 ones) for all CI, and just building the x86-64 releases on x86-64 hardware (we do x86-64 and AArch64 builds to pull into our dev container).

For personal stuff, I doubt the CI that I use costs more than £0.10/month at this kind of price. There's a real market for a cloud provider that focuses on scaling down more than on scaling up and made it easy to deploy this kind of thing (we spent far more money on the developer time to figure out the nightmare GCE web interface than we've spent on the compute. It's almost as bad as Azure and seems to be designed by the same set of creatures who have never actually met a human).

Apr 1

Apr 1

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

@me_ The #CHERIoT folks who are in town will be heading to the Maypole at 6ish for beer and pizza before CHERI Blossoms. You're welcome to join us!

Maypole FreehouseThe Maypole Freehouse | Award-Winning Pub in Cambridge | Real Ales & Italian FoodThe Maypole Freehouse – an award-winning, family-run pub in Cambridge since 1982. Serving real ales, craft beer, authentic Italian food, and handcrafted cocktails. Enjoy our spacious beer garden and welcoming atmosphere. CAMRA Pub of the Year 2018.

Mar 17

Mar 17

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

Tor Jeremiassen (Google) has written up a post on how the MPACT simulator for #CHERIoT works.

This is the fastest simulator that we currently ship, it manages around 20 MIPS on my laptop (compared to around 400 KIPS from the one generated from the formal model of the ISA). This model is designed to plug into larger system simulations, so expect to see full SoC emulators based on it!

CHERIoT Platform · Mar 17MPACT-Cheriot - A Fast and flexible simulator for the CHERIoT architectureThe Cybersecurity and Infrastructure Security Agency (CISA) has identified memory safety vulnerabilities as a major cybersecurity risk 1, pointing to reports that 70% or more of security vulnerabilities were found to involve memory-safety issues. CHERIoT 2 is a new architecture that seeks to provide strong protection against many frequently exploited memory vulnerabilities. CHERIoT is based on using CHERI 3 capability hardware-extensions to a 32 bit RISC-V 4 platform to provide fine-grained spatial and temporal safety, along with lightweight software compartmentalization. Standards like these are critical to broad adoption of security features 5, but companies implementing these also need a firm understanding of the performance tradeoffs of their implementations. In this article I introduce MPACT-Cheriot 6, an instruction-set simulator (ISS) that is designed to be extensible, easily modifiable, and fast, in order to enable rapid prototyping and debugging of the new architecture. MPACT-Cheriot is built on top of MPACT-RiscV 7, a RISC-V ISS built using the MPACT-Sim framework 8, all developed at Google and available on Github. Bob Lord, “The Urgent Need for Memory Safety in Software Products”, CISA blog, December 2023. https://www.cisa.gov/news-events/news/urgent-need-memory-safety-software-products. ↩ Amar, Saar, et al. “CHERIoT: Complete Memory Safety for Embedded Devices.” Proceedings of the 56th IEEE/ACM International Symposium on Microarchitecture, 2023. ↩ Watson, Robert N. M., et al. “An Introduction to CHERI.” Technical Report, University of Cambridge, Computer Laboratory, no. 941, 2019, https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-941.pdf ↩ RISC-V International, http://community.riscv.org. ↩ Robert N.M. Watson, et al. “It Is Time to Standardize Principles and Practices for Software Memory Safety”, Communications of the ACM, volume 68, issue 2, pages 40-45. https://dl.acm.org/doi/10.1145/3708553 ↩ Jeremiassen, Tor. “MPACT-Cheriot.”, March 2024, https://github.com/google/mpact-cheriot. ↩ Jeremiassen, Tor. “MPACT-RiscV.”, March 2023, https://github.com/google/mpact-riscv. ↩ Jeremiassen, Tor. “MPACT-Sim - Retargetable Instruction Set Simulator Infrastructure.”, March 2023, https://github.com/google/mpact-sim. ↩

#cheriot

Mar 14

Mar 14

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

@CHERIoT Great to see folks adopting #CHERIoT and learning to think about writing secure software as a side effect.

Mar 14

Mar 14

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

I learned a new acronym this week: SOUP (software of unknown provenance). Unlike mushroom-identification AI, #CHERIoT makes your SOUP safe to consume.

Mar 9

Mar 9

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

The recent news about an alleged backdoor in #ESP32 chips is largely overhyped, but it’s a good opportunity to explain how we designed #CHERIoT to make this kind of attack almost impossible by construction.

SCI and the CHERI Alliance will both be at Embedded World next week. Come and talk to us about CHERIoT and how you can adopt it with SCI’s ICENI chips in your next products.

CHERIoT Platform · Mar 9Why the alleged ESP32 backdoor couldn’t happen hereIf you’ve been following the news this weekend, you’ll have seen articles about a vulnerability (alleged to be an intentional backdoor) in ESP32 microcontrollers. The news is somewhat overhyped (the attacks probably require physical access) but it provides an opportunity to look at what we did in CHERIoT to eliminate this class of attack by construction.

Mar 7

Mar 7

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

@edwintorok

I assume that trap is a default fatal signal like SIGSEGV/SIGBUS

On CheriBSD, it's SIGPROT, which is similar.

On #CHERIoT we have per-compartment error handlers, which are normally used to do a little bit of stack unwinding on fault. See: https://cheriot.org/book/compartments.html#_using_scoped_error_handling

cheriot.orgCHERIoT Programmers' Guide

Mar 7

Mar 7

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

@edwintorok

My experience with embedded code on #CHERIoT is that memory-safety errors are pretty rare for mature codebases.

We found something similar with the original CheriBSD work: code that had run on a lot of different targets had often had memory-safety errors triggered deterministically on one of them. I found this 20 years ago with some of my own code, where some latent bugs were trivial to reproduce the first time someone tried running the code on and UltraSPARC. It's also why I still regard 'runs on a load of architectures and operating systems' as a quality badge and don't like to trust software that claims to be portable but supports on the 'big three'.

The variation in embedded targets is far greater than the variation in more mainstream software. AArch32, x86, and SPARC64 are basically identical on a spectrum that contains things like PIC24. If your code has run on a bunch of Harvard-architecture embedded devices, you've caught all of the memory-safety bugs that happen on the happy paths. The only ones left are the ones on error-handling paths.

If you're shipping safety-critical code, you should have test vectors that cover all of the error-handling conditions. Run that once on a CHERI system and you'll be told exactly where the memory-safety bugs are.

Jan 30

Jan 30

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

Sealing remains one of my favourite features of #CHERI, and is a core part of any compartmentalisation model that's usable by programmers. In the #CHERIoT compiler, we've improved it by surfacing sealing in the C/C++ type system.

CHERIoT Platform · Jan 30Introducing sealed typesSealing is one of the most important parts of CHERI because it enables usable compartmentalised interfaces. Sealing lets you build type-safe opaque types that are safe in the presence of mutual distrust and delegation. In the most recent updates to the compiler and RTOS, we’ve made this even more friendly for programmers.

Jan 21 *

Jan 21 *

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

EDIT: After lots of helpful feedback in this thread, the article is now accepted. Thanks everyone who helped!

Any #Wikipedia editors around who can help? We are trying to get the article on #CHERI added. It's so far been rejected three times:

First, it did not have enough independent citations. We added a lot to news articles about CHERI.

Second, it was insufficiently detailed and lacking context. We added a timeline of development, a load of cross references, and a simple introduction.

It was then rejected again because it lacks an explanation that a 15-year-old could understand. This is true of 90% of science-related articles on Wikipedia, so I'm not sure how we fix it. An explanation at that level is something I can write (I have done for the #CHERIoT book!) but it would then make the page 3-4 times as long and not suitable for an encyclopaedia (I've previously seen pages rejected because Wikipedia is not the right place for tutorials).

I don't understand the standards for Wikipedia and I really need some guidance for how to resolve and progress this.

en.wikipedia.orgCapability Hardware Enhanced RISC Instructions - Wikipedia

#cheri

Jan 19

Jan 19

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

Some #CHERI philosophy for the weekend. Deconflating identifiers and capabilities has enormous security value in any setting and is an approach that #CHERIoT applies at all levels of the system, from memory access up to network connections.

CHERIoT Platform · Jan 19Identifiers are not capabilitiesA lot of the security of CHERI and the higher-level abstractions that we have built in the CHERIoT platform come from one very simple idea: deconflating identifiers and capabilities. This conflation is so common and so entrenched in other systems that it will take a little while to explain what that means.

Jan 15 *

Jan 15 *

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

EDIT: The typeset draft is online which has the acknowledgements in the Preface.

I've now got a hopefully-final version of the draft cover for the #CHERIoT Programmers' Guide.

Please can the folks who contributed cats check that I have the acknowledgement (and cat names) right? Here is the draft text:

The cats on the cover represent safe, secure, compartmentalisation (what is safer or more secure than a cat in a box?).
Each cat is in a separate, isolated, compartment, in the model for which
CHERIoT was designed.

The cat photos were contributed by some wonderful people from the Fediverse.
Starting at the top left, numbered left to right then top to bottom, the photo
credits are:

1, 3, 5, 10:
Photographer: James (@chongliss), Cats: Jiji (1), Luna (3, 5), and
Felix (10).

2, 11:
Photographer: Cassian Lodge (@cassolotl), Cat: Rosa

4:
Photographer: marinbenc (marinbenc@sigmoid.social)

6:
Photographer: Asta Halkjær From (@ahfrom), Cat: Betty Rambo.

7:
Photographer: vitaut (@vitaut)

8:
Photographer: Michael McWilliams (@MichaelMcWilliams)

9:
Photographer: jarkman (@jarkman)

12:
Photographer: Isaac Freund (@ifreund), Cat: Marzipan

Draft cover for the CHERIoT Programmers’ Guide. Contains a picture of an EPROM with the silicon chip visible and a zoomed-in view that shows it full of cats in boxes. The cats are as described in the post contents.

ALT

Jan 10

Jan 10

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

I wrote a little producer-consumer example for the #CHERIoT book. It's smaller and simpler than the one in the RTOS repo.

I ran it and the output was surprising. I realised that we're yielding in futex_wake more often than we strictly needed to.

One scheduler patch later, my example runs 35% faster and the test suite 2% faster. The network stack should also be faster, since the core structure for communication between its threads is not far off my toy example.

The things I do to avoid writing...

Jan 10

Jan 10

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

There are a lot of recent improvements to the #CHERIoT compiler from folks at SCI Semiconductor and Google, along with community contributors. We are working towards tracking upstream LLVM and starting to upstream changes, as well as improving the developer experience. Owen Anderson has written a blog post describing some of the new improvements.

CHERIoT Platform · Jan 10Clang 17 now in dev container, and other toolchain newsThanks to a collaboration with the upstream CHERI toolchain, the CHERIoT toolchain has now been rebased onto Clang 17 from Clang 13, bringing us two years closer to upstream Clang. Major thanks to Alex Richardson (Google) and Sam Leffler (Google) for their work on this effort!

Jan 9

Jan 9

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

Starting the #CHERIoT new year with a post from Phil Day, in which he learns that respecting the principles of least privilege and intentional use don't just improve security, they also make it easier to understand what third-party code is doing.

CHERIoT Platform · Jan 8Using capabilities instead of trusting 3rd party codeWhen getting started with CHERI it’s easy to think of capabilities as just a better form of pointer that the compiler will set up and manage on your behalf. And of course there are a lot of memory safety benefits that come for free from that, but the real value is unlocked when you start to manipulate capabilities directly to capture intent. Let me illustrate with a simple example.

Jan 7

Jan 7

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

I recently posted that we weren't currently hiring, but actually we are on the hardware side. We're still looking to hire some DV folks for our hardware team (full remote, UK easy, elsewhere a bit harder but may be possible).

Anyone interested in working on CHERI hardware, please ping me. Please boost if you have hardware folks in your network!

#GetFediHired #hiring #CHERIoT

Jan 6

Jan 6

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

Looks as if contracts are going to make it into #CPlusPlus. The #CHERIoT programming guides will need updating to ensure that they are never used in our codebase, since (at least as currently specified) they are incompatible with modern software engineering practices.

Jan 2

Jan 2

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

My first books were all published by a US publisher and so were written in American. The #CHERIoT Programmers' Guide is currently written in English but I'm pondering translating it to American before publication. What language should it be written in?

54%English
6%American
40%C++

Dec 21, 2024

Dec 21, 2024

David Chisnall (*Now with 50% more sarcasm!*) @david_chisnall@infosec.exchange

I finally turned off GitHub Copilot yesterday. I’ve been using it for about a year on the ‘free for open-source maintainers’ tier. I was skeptical but didn’t want to dismiss it without a fair trial.

It has cost me more time than it has saved. It lets me type faster, which has been useful when writing tests where I’m testing a variety of permutations of an API to check error handling for all of the conditions.

I can recall three places where it has introduced bugs that took me more time to to debug than the total time saving:

The first was something that initially impressed me. I pasted the prose description of how to communicate with an Ethernet MAC into a comment and then wrote some method prototypes. It autocompleted the bodies. All very plausible looking. Only it managed to flip a bit in the MDIO read and write register commands. MDIO is basically a multiplexing system. You have two device registers exposed, one sets the command (read or write a specific internal register) and the other is the value. It got the read and write the wrong way around, so when I thought I was writing a value, I was actually reading. When I thought I was reading, I was actually seeing the value in the last register I thought I had written. It took two of us over a day to debug this. The fix was simple, but the bug was in the middle of correct-looking code. If I’d manually transcribed the command from the data sheet, I would not have got this wrong because I’d have triple checked it.

Another case it had inverted the condition in an if statement inside an error-handling path. The error handling was a rare case and was asymmetric. Hitting the if case when you wanted the else case was okay but the converse was not. Lots of debugging. I learned from this to read the generated code more carefully, but that increased cognitive load and eliminated most of the benefit. Typing code is not the bottleneck and if I have to think about what I want and then read carefully to check it really is what I want, I am slower.

Most recently, I was writing a simple binary search and insertion-deletion operations for a sorted array. I assumed that this was something that had hundreds of examples in the training data and so would be fine. It had all sorts of corner-case bugs. I eventually gave up fixing them and rewrote the code from scratch.

Last week I did some work on a remote machine where I hadn’t set up Copilot and I felt much more productive. Autocomplete was either correct or not present, so I was spending more time thinking about what to write. I don’t entirely trust this kind of subjective judgement, but it was a data point. Around the same time I wrote some code without clangd set up and that really hurt. It turns out I really rely on AST-aware completion to explore APIs. I had to look up more things in the documentation. Copilot was never good for this because it would just bullshit APIs, so something showing up in autocomplete didn’t mean it was real. This would be improved by using a feedback system to require autocomplete outputs to type check, but then they would take much longer to create (probably at least a 10x increase in LLM compute time) and wouldn’t complete fragments, so I don’t see a good path to being able to do this without tight coupling to the LSP server and possibly not even then.

Yesterday I was writing bits of the CHERIoT Programmers’ Guide and it kept autocompleting text in a different writing style, some of which was obviously plagiarised (when I’m describing precisely how to implement a specific, and not very common, lock type with a futex and the autocomplete is a paragraph of text with a lot of detail, I’m confident you don’t have more than one or two examples of that in the training set). It was distracting and annoying. I wrote much faster after turning it off.

So, after giving it a fair try, I have concluded that it is both a net decrease in productivity and probably an increase in legal liability.

Discussions I am not interested in having:

You are holding it wrong. Using Copilot with this magic config setting / prompt tweak makes it better. At its absolute best, it was a small productivity increase, if it needs more effort to use, that will be offset.
This other LLM is much better. I don’t care. The costs of the bullshitting far outweighed the benefits when it worked, to be better it would have to not bullshit, and that’s not something LLMs can do.
It’s great for boilerplate! No. APIs that require every user to write the same code are broken. Fix them, don’t fill the world with more code using them that will need fixing when the APIs change.
Don’t use LLMs for autocomplete, use them for dialogues about the code. Tried that. It’s worse than a rubber duck, which at least knows to stay silent when it doesn’t know what it’s talking about.

The one place Copilot was vaguely useful was hinting at missing abstractions (if it can autocomplete big chunks then my APIs required too much boilerplate and needed better abstractions). The place I thought it might be useful was spotting inconsistent API names and parameter orders but it was actually very bad at this (presumably because of the way it tokenises identifiers?). With a load of examples with consistent names, it would suggest things that didn't match the convention. After using three APIs that all passed the same parameters in the same order, it would suggest flipping the order for the fourth.

#GitHubCopilot #CHERIoT

Recent searches

Search options

Administered by:

Server stats:

#CHERIoT