I submitted my js1k demo just now. It's called "Xcode - a CoreWars engine" and implements RedCode, a minimal instruction set to play CoreWars. CoreWars is a game from 1984 (when what you can render in a JS1k now wouldn't even render with a seven story super computer then) where two computer programs try to eliminate the other.
I'd better write this now while the code is still fresh :)
This is my fifth JS1k demo! (Previously: 
http://js1k.com/2016-elemental/demo/2491, 
http://js1k.com/2015-hypetrain/demo/2192, 
http://js1k.com/2012-love/demo/1130, 
http://js1k.com/demo/649)
Edit: I published the dev sources and stuff on https://github.com/pvdz/js1kdemo2019 and you can see the full pre-golfed UI at https://pvdz.github.io/js1kdemo2019/js1k.full.htmlCoreWars
The game has a faily simple setup:
You have the game memory, called "the core". The standard core consists of 8000 cells (although the size is arbitrary), each cell being an instruction with two registers. The core wraps around and any address is relative to the position from where it was read.
The programs are called "warriors" and each warrior has no idea where it is or where the other warrior is, except that it's at least some cells away from itself, usually 100. Warriors are compiled in the core and have a max size, usually 100 cells. Core cells not occupied by a warrior is initialized to a 
DAT 0 0.
I implemented the "ICWS'88" specification (international core wars spec) because it did not have the register modifiers which made it simpler to implement for JS1k. It also doesn't have the personal space stuff. There are 11 actual instructions and two compile-time instructions (
EQU and 
END).
Addressing
There are four addressing modes: 
- immediate, prefixed with 
# (consider this an absolute value or literal, often modifying the behavior of an instruction)
- direct, no prefix
- indirect, prefixed with 
@ (read the B address at given address)
- indirect decrement, prefixed with 
< (same as indirect except it decrements first)
I found the addressing modes a little confusing and to be honest it's still a little hard to read them, especially without labels.
Instruction set
The instruction set is:
- DAT: specced to terminate the current process. It's usually the goal to get the other program to execute one of these. Although sometimes a warrior wants to reduce its own process queue.
- ADD/SUB: add values to registers
- CMP: skip one instruction if certain operands are equal
- DJN: basically CMP but decrements first
- SLT: CMP if A
- JMP: jumps to given address unconditionally
- JMN/JMZ: jumps to given address if B is non-zero/zero
- MOV: copy one value or whole cell to another cell
- SPL: create a new process for the current warrior which starts at a given cell
So really there's just five instruction groups: kill (DAT), math (ADD/SUB), branching (CMP, DJN, SLT, JMP, JMN, JMZ), copying (MOV), and multi-tasking (SPL).
Beyond that, the game ends in a draw if both warriors are still alive after some time, standard after 80.000 ticks.
Warriors
While the instruction set leaves a somewhat limited number of warriors to be built, the addressing modes make for crazy complex things to happen. One of the first warriors is called "the imp" and it's actually quite a simple program:
MOV 0 1
It's a replicating warrior that copies itself to the next cell. Ad infinitum. It's not going to win you many games, unless the opponent is suicidal. But it's notoriously difficult to kill as you have to write a 
DAT to the next position of the opponent immediately after its turn. That's quite sophisticated for what you can work with.
The second one is "the dwarf":
ADD #4 3
MOV 2, @2
JMP -2
DAT #0, #0
This warrior simply "bombs" the core by copying a 
DAT instruction to every third cell in the core, in hopes of randomly destroying the opponent.
My own JS1k demo warrior is somewhat of a dwarf:
MOV 5 <5
SUB #5 4
ADD #6 4
ADD 3 2
JMP -4
DAT 0 2
It bombs the core at an quadratic interval, which made for a slightly more interesting demo visally than a regular dwarf :)
The demo
Anyways. On to my demo. The original demo has been drasticaly cut down. Some features, like a full compiler and share-by-url-hash, had to be abandoned. Let's go through some of the significant features.
Note that I used 
RegPack to trim the last 200 bytes off. This tool is great but has some limits and drawbacks.
The core is encoded as an array of arrays. Each sub-array is a quintuple of 
[opcode, A modifier, A value, B modifier, B value]. The opcode is upper case wysiwyg, the values are numbers. This makes complication rather easy as we can almost split the input into the core. Almost.
Let's take a look at the demo step-by-step;
b.innerHTML=`<textarea rows=5 id=W>MOV 0 1</textarea><textarea rows=5 id=Y>MOV 5 <5\nSUB #5 4\nADD #6 4\nADD 3 2\nJMP -4\nDAT 0 2</textarea><pre id=P></pre>`
Textareas make for each monospaced multi-line input fields. Pre is the only 
white-space: pre element, luckily it has a short name :)
The id does not need to be quoted. The textarea supports a 
rows attribute which makes it rather cheap to force a height since doing it through css is quite byte intensive.
This line is actually sub-optimal because it was originall a tagged template with newlines. Unfortunately 
regpack does (did) not support that and while 
\u2028 type newlines did work, they did not end up nicely visually. Plus, regpack would take the newlines anyways and turn them into one character, making this a minimal loss.
The initial demo warriors actually turned out to be quite expensive and it took me a while to land on two demo warriors that would yield a somewhat interesting output without being too excessive in size. I think displaying the Imp is nice since it's a pretty classic warrior to whole thing, and then a pseudo random dwarf.
v=A=>(A+8e9)%8e3
Basically an addressing normalization function, making sure the final address is always between zero and 
7999. I do cheat a little bit by simply adding 
8e9 rather than explicitly adding 8000 until the value is positive. I think this works ok. 
Note that 
8e9 is short for 
8.000.000.000.
y=A=>c[v(A)]
Read a cell, normalize the address. Unfortunately this function cannot be used to write to cells.
x=(A,B,C)=>B=='#'?C:A+C+(B?y(A+C)[4]=v(y(A+C)[4]-(B=='<')):0)
Resolve an address depending on its addressing mode.
If the modifier is a 
#, return the value as is (
C). 
If there is no modifier then return the value relative to the current address (
A+C). 
If the modifier is 
@ then return the B register of the address at the value relative to the current address (
cell[A+B][4]).
If the modifier is 
< then do the same thing as for 
@, except also decrement and update the B (
--cell[A+B][4]).
One optimization was to do 
y(A+C)[4]=v(y(A+C)[4]-(B=='<'), which updates the B of the indirect address but which will do 
-= 0> if the modifier is 
@, effectively not really updating B at all. In addition, regpack will drop the repeated occurrence of doing 
y(A+C)[4] so that's not a big deal.
;u=(A,b,q=p=0)=>A.value.replace(/^(...)( +([#@<]?) *(.+?)( +([#@<]?) *(.+?))?)?( *;.*)?$/gm,(m,A,a,B,C,d,D,E)=>q||A&&(q=A=='END')?p=C:c[v(b++)]=[A,B,C|0,D,E|0])|p
Ah, the compiler. The original regex clocked in at 317 bytes. While it was able to do a lot more, like aliases and comments and all that, it was simply too big. This is 59 bytes for the basic functionality, and even comments!
I made non-capturing groups (
(?: ...)) capturing and added the dud params to the callback. Each capturing group is guaranteed to hit something so I don't have to worry about `undefined` slots. the 
|0 is an old trick to force a string to number.
If the op is 
END, its operand will be logged as the starting offset. This will be a string at assignment time. However, this "compile" function also returns that number and it does so with a biary "or", which will also force the value into a number. Note that the replace will never return a value that screws up that offset.
_=DAT=A=>h=-1
add=A=>m[4]=v(k?m[4]+i*A:(m[2]=v(l[2]+m[2]*A),l[4]+m[4]*A))
ADD=A=>add(1)
SUB=A=>add(-1)
CMP=A=>h+=k?i==m[4]:l+''==m
DJN=A=>h=--(D=='#'?y(h):m)[4]?A:h
JMP=A=>h=A
JMN=A=>h=m[4]?A:h
JMZ=A=>h=m[4]?h:A
MOV=A=>k?m[4]=i:c[v(j)]=[...l]
SLT=A=>h+=(k?i:l[4])<m[4]
SPL=A=>h=g.push(v(++h))&&A
Clearly these are the ops. They are consolidated and folded as much as possible. But what should also be quite apparent here is that effort was made to make them look the same. This same pattern of 
=A=>h= repeats itself multiple times in order to try and persuade regpack to compress them better.
So the warrior is a global and just a number (current pc). If it ends up as a negative then it's not put back in the queue. Otherwise it is incremented and pushed back.
DAT clearly kills the current process by setting it to -1.
ADD/SUB use a generalized abstraction to share the code. I could not get this in such a shape that works nicely with regpack so I went for this instead. At least the ADD/SUB funcs are very similar.
CMP abuses coercion to conditionally increment the warrior by one (1+true = 1+1, 1+false = 1+0). It also abuses coercion of arrays, doing 
a+''==b to trigger the two arrays to be joined and comparable as strings. This works since each cell is an array and doing 
a+'' causes 
a.join(','). The weak comparison then triggers the same for the rhs, resulting in a plain string comparison of two arrays.
DJN abuses the fact that while a group doesn't return a reference, accessing a property on some value is in fact a reference, so we can decrement that on a ternary. It then either returns itself (no jump) or the operand.
MOV uses spread to make sure two cells do not share the same instance. It's still shorter than slicing or anything else.
SPL was annoying. Since it puts the current warrior back on the stack, incremented, I had to make sure to put it on the stack manually before replacing the current process with its clone such that the normal end would put it back on the stack. No big deal but a little tricky.
The instructions are defined as globals because they are 
eval-ed. I started out with an immediately invoked oject literal but the eval strat was simply shorter. Unfortunately some of the ops have active side effects (notably decrementing and other assignments) so they have to be functions to prevet arbitrary side effects.
(P.onclick=A=>h=(n=a=h=A=>{
The core pre element gets an onclick handler to restart and this function is invoked immediately.
g=o%2?d:f
Cache the current queue of processes. I tried hard to eliminate this one but it's just too hard.
h=g.shift()
Get the next process of the current process queue.
[A,B,C,D,E]=y(h)
Abusing destructuring to quickly decompose the instruction and registers. This has seen a few iterations, including using an object instead (this makes accessing individual parts cheaper) or not destructuring at all. But in the end this was the most efficient.
i=x(h,B,C)
j=x(h,D,E)
Resolve the address of A and B operand, even if not actually used. The op is evalled on a global level and so these will be available.
k=B=='#'
l=y(i)
m=y(j)
Just caching some values. The address always needs to be normalized but immediate values would not need that, in particular negative literals.
Caching the hash comparison is just saving a byte. I don't think it should but it does. Such is the game of roulette.
eval(A)(v(i-1))
Execute the opcode and pass it on the A-address minus one. The minus one is because if it is used then it's set to the current warrior PC, which is incremented afterwards, regardless. So to counter that increment, the A is decremented first. Looking at it now I'm not sure whether there's a bug in one of hte opcodes regarding this decrement, but meh.
h>=0&&SPL()
If the process is non-negative, put it back in the current process queue. Actually it re-uses the SPL instruction here, which pretty much achieves the same thing since this is the last time in the main loop where the current process is accessed.
(g.length?--o?(n=setTimeout(a,1))&&[o,d.length,f.length]:'Time':o%2?'B!':'A!')
+c.map(([[A]],i)=>
(i%150?'':'\n')
// +(d.includes(i)?`<b style='background-color:#109'>`:f.includes(i)?`<b style='background-color:#900'>`:'')+A[0][0]+'</b>'
+(d.includes(i)?`<b style='background-color:#109'>`:f.includes(i)?`<b style='background-color:#900'>`:'')+A+'</b>'
).join('')
Gunk, the output. First check whether the current warrior still has processes left. If so queue another timeout, if not print a message. Also assert the time limit has not been reached.
For the coe we simply print the first character of each instruction. I special cased the underscore as the default value. A bit misleading but it just looks better this way. As a visual queue it will check whether the current index is contained in either instruction queue. If so it will compile that cell to a color. This effect is stronger for splitting warriors but unfortunately those are also quite big and/or super bad.
(
clearTimeout(n)
,c=Array(8e3).fill(o=8e4).map(A=>h=['_','#',0,'#',0])
,f=[u(Y,e=100+~~(Math.random()*7800))+e]
,d=[u(W,0)]
)
This is the initialization that happens each time the game re-/starts. It clears the previous game timer. It fills the core. 
Had to 
map because 
fill does not clone the array. So I just used the 
fill param to save a byte. The 
map arrow has the same look as many other arrows and as a result the overhead is not as bad as it looks.
The last two steps initialize the warriors, one starts at zero (always) and the other one at randomly 7800 to make sure there's always a 100 cell buffer left and right. The compile function return the start offset (nonzero in case 
END was found) and this is the initial process of the queue for each warrior.
And that's it :) Just like JS1k :( But we had a good time :D