👇🏿 📓 👽 AI and 2048. Part 2: Minimax + alpha beta clipping 😷 💣 😼

We examined the Monte Carlo method , today we will see how the computer mind plays in 2048 using the good old minimax with alpha-beta clipping.

This article was written with the support of EDISON Software, a company that develops mobile applications and provides software testing services .

Solution spied on by user stackoverflow

ovolve , who noted in the discussion how to teach AI the game 2048 .

Comment translation from ovolve

I am the author of the program mentioned in this thread. You can see the AI in action or see the code .

Currently, the program wins in about 90% of cases by executing java-scripts in a browser on my laptop, spending 100 milliseconds to think about the course, working, although not perfectly, but quite well.

Since the game is a discrete state space with complete information, in fact being a turn-based game like chess and checkers, I used the same methods that showed their performance in these games, namely minimax search with alpha-beta clipping . Since the links provide a lot of information about this algorithm, I’ll just talk about the two main heuristics that I used in the static estimation function and formalizing many of the intuitive assumptions made by other people here.

Monotone

This heuristic tries to ensure that all tile values either increase or decrease both left / right and up / down. This heuristic alone reflects the conjecture that many others have mentioned that more valuable tiles should be grouped in a corner. This, as a rule, prevents the accumulation of less valuable tiles and keeps the board organized as smaller tiles cascade into larger ones.

Here is a screenshot of a completely monotonous grid. I got this situation by running an algorithm with the installed eval function in order to ignore other heuristics and take into account only monotony.

Smoothness (smoothness, evenness)

The above heuristic in itself tends to create structures in which neighboring cells are reduced in value, however, of course, neighbors should have the same meaning to unite. Therefore, the heuristic of smoothness simply measures the difference in values between adjacent tiles, trying to minimize their number.

A commentator at Hacker News provided an interesting formalization of this idea in terms of graph theory.

Translation of formalization with Hacker News

Yesterday I showed this game to a colleague, a lover of graph theory, and we also decided to think about how to solve this game using AI.

The simplest solution is minimax, which, as I see it, is implemented pretty well. If someone here is not familiar with minimax, OP wrote very elegant and well-commented code that would be a great tutorial.

The less computationally intensive approach we proposed was to simulate the game state in the form of a graph G (V, E) , where V is a set of active tiles and E is a set of edges connecting adjacent tiles weighted by function c (v1, v2) , which returns the absolute value of the difference between the two tiles. For each solution, the AI chooses a move that minimizes the sum of the weights of all the edges in the new game state.

The reason for this is that the only way to make progress in the game is to have tiles with the same values next to each other, for which the weight in G will be 0. Thus, the AI should try to minimize the total weight. In the end, there will be a large number on the boards with a large weight of edges to adjacent tiles, so the AI will try to keep these tiles next to other large tiles to minimize the difference.

Since the game is stochastic, the approach I described may not work in the worst case, but it can also be applied to the existing minimax solution as a weight function for each node in the tree.

Here is a screenshot of a perfectly smooth mesh, kindly provided by this excellent mock fork . (link to the web archive, while Java scripts on the page work and you can use the keyboard to make a move in any direction - note by the translator).

Loose tiles

And finally, there is a penalty for having too few free tiles, since the options can quickly end when the playing field becomes too cramped.

And it's all! Searching the game space while optimizing these criteria gives surprisingly good performance. One of the benefits of using a generic approach like this rather than an explicitly encoded move strategy is that the algorithm can often find interesting and unexpected solutions. If you observe his progress, he often makes amazing but effective movements, such as the sudden change of walls or corners, near which he builds his game.

Small change

The screenshot demonstrates the power of this approach. I canceled the tile limit (so that it continues to grow after reaching 2048), and here is the best result after eight tests.

Yes, this is 4096 along with 2048. =) This means that he reached the elusive 2048 tile on one board.

Java-Script code for minimax with alpha-beta clipping and a static evaluation function from the stackoverflow user ovolve is given below in the article.

The minimax method is devoted to several excellent habr-articles, so we omit the academic detailed explanation of what it consists of. For those who ~~joined the IT community, I recently~~ heard the beautiful terms “minimax” and “alpha-beta clipping”, but don’t know what this means, let’s try, literally in a couple of paragraphs, to explain the most general meaning.

Minimax

In some games, the process of a game between two players (who make a move in turn) can be represented as a so-called tree of options. In each specific position, each player usually has a choice between different options for his move. And in response to each of these options, an opponent can also be like in many ways.

^{Option tree fragment}

Since at any moment of the game there is complete information about the state of the playing field, the current state of the position can always be accurately estimated. Such a function is called a static evaluation function or abbreviated SFO . Moreover, the more important this function is when evaluating a specific position, the more advantageous is the position for one player (let's call it the maximizing player ). The smaller the numerical value of this function when evaluating a position, the more advantageous is the position for the second player (let's call it the minimizing player ).

After each move, the position changes, and so its score changes. When considering the tree of options, each player needs to not just prefer those branches in which the rating is most favorable to him. You should also avoid those branches in which the evaluation of the position is favorable for the opponent.

It is assumed that the opponent is also guided by rationalism and also avoids options that could lead him to lose. That is, each player, when choosing an option, proceeds from maximizing his own benefit and at the same time minimizing the opponent’s profit.

This is minimax.

Alpha beta clipping

It is quite obvious: who calculates a tree from a given position to a greater depth, he has more chances to win. But there is one nuisance - the tree of options in games has a nasty habit of branching and growing exponentially with each level of nesting. The counting abilities of programs and even more so people are limited; counting “right up to the mat” is far from always possible. It can easily turn out that the player has counted to a position where he has a good assessment of the playing field, but literally at the next (unreadable) level the opponent has the opportunity to make such a move that radically changes the position estimate to the opposite.

Who is to blame and what to do? The computational complexity is to blame for the complete tree traversal; it is proposed to fight by cutting off unnecessary branches. If the player evaluating the position sees that there is some branch of the options tree:

or less profitable for it than other branches that have already been analyzed,
or more beneficial to the opponent than other branches that have already been analyzed,

then the player discards this branch, does not waste time and resources on considering sub-options from this obviously worse branch for him.

This allows you to allocate more computational resources for calculating more favorable branches for a greater rendering depth in the options tree. In the process of evaluating the playing field at different levels of the options tree, the player operates with two dynamically changing coefficients - alpha (the value of the SFD that is minimally encountered in the branch - i.e. more favorable for the minimizing player) and beta (the value of the SFO that is most encountered in the branch - i.e. more favorable for the maximizing player). At each level, comparing the SFD of the current position with alpha and beta coefficients allows you to sweep (without calculating them completely) branches that are less beneficial for the player evaluating the position and / or more beneficial for his opponent.

This is alpha beta clipping.

Minimalax recursive function with alpha beta clipping

2048 with AI is implemented as an Excel application with VBA macros, this is how the minimax algorithm with alpha beta clipping looks like a despicable visual basic.

''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''( - )''''''''''''''''''''''''' ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' '       -- 'Position -  4  4     'Depth - ,    'Alpha, Beta -         'MaximisingPlayer -      ? Private Function MiniMaxAlpaBeta_Evaluation(Position As Variant, Depth As Long, _ Alpha As Double, Beta As Double, _ MaximisingPlayer As Boolean, _ Optional MainLevel As Boolean = False) As Double Dim MaxEval As Double '  Dim MinEval As Double '  Dim PositionNext As Variant '     Dim PositionTemp As Variant '     Dim Eval As Double '   Dim Way As Long '   -      Dim Row As Long '     Dim Col As Long '     Dim TileNew As Long '      '   (  ,  '     ) If GameOverPosition(Position) Then '    ? '     MiniMaxAlpaBeta_Evaluation = -1000000 + TileMax(Position) '         ElseIf Depth = 0 Then '     MiniMaxAlpaBeta_Evaluation = StaticEvaluation(Position) '  ,    '     () ElseIf MaximisingPlayer Then MaxEval = -1000000 '      For Way = 1 To 4 ' 4   - (, , , ) ChangeCount = 0 ' ,      ',       PositionNext = StepHuman(Position, Way) If ChangeCount > 0 Then '     '      , '    () Eval = MiniMaxAlpaBeta_Evaluation(PositionNext, Depth - 1, _ Alpha, Beta, False) If Eval > MaxEval Then MaxEval = Eval '  '     If Eval > Alpha Then Alpha = Eval '    ,   '   -    If Beta > Alpha Then Exit For End If Next '          MiniMaxAlpaBeta_Evaluation = MaxEval '  ,    '     () Else 'Not MaximisingPlayer MinEval = 1000000 '      For Row = 1 To 4 '     For Col = 1 To 4 '     If Position(Row, Col) = 0 Then '   For TileNew = 2 To 4 Step 2 '    2  4 ',       '    PositionNext = StepComp(Position, Row, Col, TileNew) '     , '    () Eval = MiniMaxAlpaBeta_Evaluation(PositionNext, Depth - 1, _ Alpha, Beta, True) If Eval < MinEval Then MinEval = Eval '  '     If Eval < Beta Then Beta = Eval '    ,   '   -    If Alpha < Beta Then Exit For Next End If Next Next '          MiniMaxAlpaBeta_Evaluation = MinEval End If End Function

Ovolve code in java-script

 function AI(grid) { this.grid = grid; } //   () AI.prototype.eval = function() { var emptyCells = this.grid.availableCells().length; var smoothWeight = 0.1, //monoWeight = 0.0, //islandWeight = 0.0, mono2Weight = 1.0, emptyWeight = 2.7, maxWeight = 1.0; return this.grid.smoothness() * smoothWeight //+ this.grid.monotonicity() * monoWeight //- this.grid.islands() * islandWeight + this.grid.monotonicity2() * mono2Weight + Math.log(emptyCells) * emptyWeight + this.grid.maxValue() * maxWeight; }; // alpha-beta depth first search AI.prototype.search = function(depth, alpha, beta, positions, cutoffs) { var bestScore; var bestMove = -1; var result; // the maxing player if (this.grid.playerTurn) { bestScore = alpha; for (var direction in [0, 1, 2, 3]) { var newGrid = this.grid.clone(); if (newGrid.move(direction).moved) { positions++; if (newGrid.isWin()) { return { move: direction, score: 10000, positions: positions, cutoffs: cutoffs }; } var newAI = new AI(newGrid); if (depth == 0) { result = { move: direction, score: newAI.eval() }; } else { result = newAI.search(depth-1, bestScore, beta, positions, cutoffs); if (result.score > 9900) { // win result.score--; // to slightly penalize higher depth from win } positions = result.positions; cutoffs = result.cutoffs; } if (result.score > bestScore) { bestScore = result.score; bestMove = direction; } if (bestScore > beta) { cutoffs++ return { move: bestMove, score: beta, positions: positions, cutoffs: cutoffs }; } } } } else { // computer's turn, we'll do heavy pruning to keep the branching factor low bestScore = beta; // try a 2 and 4 in each cell and measure how annoying it is // with metrics from eval var candidates = []; var cells = this.grid.availableCells(); var scores = { 2: [], 4: [] }; for (var value in scores) { for (var i in cells) { scores[value].push(null); var cell = cells[i]; var tile = new Tile(cell, parseInt(value, 10)); this.grid.insertTile(tile); scores[value][i] = -this.grid.smoothness() + this.grid.islands(); this.grid.removeTile(cell); } } // now just pick out the most annoying moves var maxScore = Math.max(Math.max.apply(null, scores[2]), Math.max.apply(null, scores[4])); for (var value in scores) { // 2 and 4 for (var i=0; i<scores[value].length; i++) { if (scores[value][i] == maxScore) { candidates.push( { position: cells[i], value: parseInt(value, 10) } ); } } } // search on each candidate for (var i=0; i<candidates.length; i++) { var position = candidates[i].position; var value = candidates[i].value; var newGrid = this.grid.clone(); var tile = new Tile(position, value); newGrid.insertTile(tile); newGrid.playerTurn = true; positions++; newAI = new AI(newGrid); result = newAI.search(depth, alpha, bestScore, positions, cutoffs); positions = result.positions; cutoffs = result.cutoffs; if (result.score < bestScore) { bestScore = result.score; } if (bestScore < alpha) { cutoffs++; return { move: null, score: alpha, positions: positions, cutoffs: cutoffs }; } } } return { move: bestMove, score: bestScore, positions: positions, cutoffs: cutoffs }; } // performs a search and returns the best move AI.prototype.getBest = function() { return this.iterativeDeep(); } // performs iterative deepening over the alpha-beta search AI.prototype.iterativeDeep = function() { var start = (new Date()).getTime(); var depth = 0; var best; do { var newBest = this.search(depth, -10000, 10000, 0 ,0); if (newBest.move == -1) { break; } else { best = newBest; } depth++; } while ( (new Date()).getTime() - start < minSearchTime); return best } AI.prototype.translate = function(move) { return { 0: 'up', 1: 'right', 2: 'down', 3: 'left' }[move]; }

Static Evaluation Function

Since at each level in the options tree you have to evaluate the playing field (in order to decide for which of the players, the estimated position is actually more advantageous), you need to decide by what criteria to distinguish a good position from a bad one.

We assume that the maximizing player is the person (or AI) who decides which of the 4 directions (up, left, right, down) to move all the tiles in. A minimizing player is that insidious subroutine that randomly generates 2 or 4 in the most inappropriate places.

SFO is compiled from the perspective of a maximizing player. The higher the SFD rating for the playing field, the better the position for the “maximalist”. The lower - the more pleasant the position on the board for the "minimalist".

In the case of 2048 - what factors are considered favorable for the one who moves the tiles?

Monotone

Firstly, it is desirable that the tiles are arranged in ascending / descending order in some directions. If you don’t do this, then when generating new tiles, the playing field will quickly become clogged with randomly arranged tiles of different sizes, which cannot immediately be connected normally with each other.

In the Siberian Federal District, you need to look in all 4 directions (top-down, left-to-right, right-to-left, bottom-up) and calculate where the tiles are a decreasing or increasing progression. If in progression there are tiles that do not fit into the general series, then this reduces the numerical coefficient of monotony. Then, from the 4 coefficients for all directions, the best one is selected, which is taken into account in the total value of the Siberian Federal District.

Smoothness

Moreover, it would be more preferable if the progression from standing in a row of tiles was not just increasing, but non-decreasing (or instead of decreasing row, it is preferable to non-increasing), that is, it is good when the same tiles are nearby, which allows them to collapse into one, gaining points and increasing free space on the playing field.

Therefore, the Siberian Federal District is looking for identical tiles next to it on the playing field and takes into account the number of such pairs in a special coefficient.

Empty cells

Obviously, the more free space, the more room for maneuver and the less likely to lose quickly.

SFO considers empty cells on the field and the more of these, the position is considered more profitable for the maximizing player.

Maximum tile

Since the main thing in this game is to get a large tile on the field, the more the better - 2048, 4096, 8192 (or whatever you have the strength and patience for), the options where the maximum tile value is more should be considered as the most profitable SFD.

Siberian Federal District for 2048

Implementation of Siberian Federal District as a VBA macro

 ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''  '''''''''''''''''''''''' ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' '     'Position -  4  4     Private Function StaticEvaluation(Position As Variant) As Double Dim Smoothness As Double ' Dim Monotonicity As Double ' Dim EmptyCount As Double '  Dim MaxValue As Long '  '   Const SmoothWeight = 0.1 Const MonoWeight = 1 Const EmptyWeight = 2.7 Const MaxWeight = 1 Dim k As Long '   Dim i As Long '  Dim j As Long '  Dim x As Long '  Dim y As Long '  ' Dim Value As Double '       '         Dim TargetValue As Double Smoothness = 0 '    For i = 1 To 4 '     For j = 1 To 4 '     If Position(i, j) > 0 Then '   Value = Log(Position(i, j)) / Log(2) If i < 4 Then '       For x = i + 1 To 4 '    If Position(x, j) > 0 Then '    '    TargetValue = Log(Position(x, j)) / Log(2) ',   Smoothness = Abs(Value - TargetValue) '       Exit For End If Next End If If j < 4 Then '       For y = j + 1 To 4 '    If Position(i, y) > 0 Then '    '    TargetValue = Log(Position(i, y)) / Log(2) ',   Smoothness = Abs(Value - TargetValue) '        Exit For End If Next End If End If Next Next ' Dim arrTotals(1 To 4) As Double '     Dim Current As Long '   Dim Next_ As Long '      Dim CurrentValue As Double '      Dim NextValue As Double '        Monotonicity = 0 '    '      For k = 1 To 4 arrTotals(k) = 0 Next ' -  - For x = 1 To 4 '   Current = 1 '      '   (     )  Next_ = Current + 1 Do While Next_ <= 4 '       '      '(       ) Do While Next_ <= 4 '       If Position(x, Next_) = 0 Then '     Next_ = Next_ + 1 '   Else ' -    Exit Do ' ,  ,   End If Loop '         If Next_ > 4 Then Next_ = 4 '          If Position(x, Current) > 0 Then CurrentValue = Log(Position(x, Current)) / Log(2) Else CurrentValue = 0 End If ' MsgBox "Position[" & x & ", " & Next_ & "]=" & Position(x, Next_) If Position(x, Next_) > 0 Then NextValue = Log(Position(x, Next_)) / Log(2) Else NextValue = 0 End If If CurrentValue > NextValue Then '   ? arrTotals(Up) = arrTotals(Up) + NextValue - CurrentValue ElseIf NextValue > CurrentValue Then '   ? arrTotals(Down) = arrTotals(Down) + CurrentValue - NextValue End If Current = Next_ '       Next_ = Next_ + 1 '    Loop Next '      -  - Monotonicity = IIf(arrTotals(Up) >= arrTotals(Down), _ Monotonicity + arrTotals(Up), _ Monotonicity + arrTotals(Down)) ' -  - For y = 1 To 4 '   Current = 1 '      '   (     )  Next_ = Current + 1 Do While Next_ <= 4 '       '      '(       ) Do While Next_ <= 4 '       If Position(Next_, y) = 0 Then '     Next_ = Next_ + 1 '   Else ' -    Exit Do ' ,  ,   End If Loop '         If Next_ > 4 Then Next_ = 4 '          If Position(Current, y) > 0 Then CurrentValue = Log(Position(Current, y)) / Log(2) Else CurrentValue = 0 End If If Position(Next_, y) > 0 Then NextValue = Log(Position(Next_, y)) / Log(2) Else NextValue = 0 End If If CurrentValue > NextValue Then '   ? arrTotals(Left) = arrTotals(Left) + NextValue - CurrentValue ElseIf NextValue > CurrentValue Then '   ? arrTotals(Right) = arrTotals(Right) + CurrentValue - NextValue End If Current = Next_ '       Next_ = Next_ + 1 '    Loop Next '      -  - Monotonicity = IIf(arrTotals(Left) >= arrTotals(Right), _ Monotonicity + arrTotals(Left), _ Monotonicity + arrTotals(Right)) '     EmptyCount = 0 '      MaxValue = 0 '    For i = 1 To 4 '     For j = 1 To 4 '     If Position(i, j) = 0 Then '  ... '...     EmptyCount = EmptyCount + 1 '     ... ElseIf Position(i, j) > MaxValue Then MaxValue = Position(i, j) '...    End If Next Next '   StaticEvaluation = Smoothness * SmoothWeight + _ Monotonicity * MonoWeight + _ Log_Base_Arg(EmptyCount) * EmptyWeight + _ MaxValue * MaxWeight End Function

Ovolve code in java-script

 function Grid(size) { this.size = size; this.startTiles = 2; this.cells = []; this.build(); this.playerTurn = true; } // pre-allocate these objects (for speed) Grid.prototype.indexes = []; for (var x=0; x<4; x++) { Grid.prototype.indexes.push([]); for (var y=0; y<4; y++) { Grid.prototype.indexes[x].push( {x:x, y:y} ); } } // Build a grid of the specified size Grid.prototype.build = function () { for (var x = 0; x < this.size; x++) { var row = this.cells[x] = []; for (var y = 0; y < this.size; y++) { row.push(null); } } }; // Find the first available random position Grid.prototype.randomAvailableCell = function () { var cells = this.availableCells(); if (cells.length) { return cells[Math.floor(Math.random() * cells.length)]; } }; Grid.prototype.availableCells = function () { var cells = []; var self = this; this.eachCell(function (x, y, tile) { if (!tile) { //cells.push(self.indexes[x][y]); cells.push( {x:x, y:y} ); } }); return cells; }; // Call callback for every cell Grid.prototype.eachCell = function (callback) { for (var x = 0; x < this.size; x++) { for (var y = 0; y < this.size; y++) { callback(x, y, this.cells[x][y]); } } }; // Check if there are any cells available Grid.prototype.cellsAvailable = function () { return !!this.availableCells().length; }; // Check if the specified cell is taken Grid.prototype.cellAvailable = function (cell) { return !this.cellOccupied(cell); }; Grid.prototype.cellOccupied = function (cell) { return !!this.cellContent(cell); }; Grid.prototype.cellContent = function (cell) { if (this.withinBounds(cell)) { return this.cells[cell.x][cell.y]; } else { return null; } }; // Inserts a tile at its position Grid.prototype.insertTile = function (tile) { this.cells[tile.x][tile.y] = tile; }; Grid.prototype.removeTile = function (tile) { this.cells[tile.x][tile.y] = null; }; Grid.prototype.withinBounds = function (position) { return position.x >= 0 && position.x < this.size && position.y >= 0 && position.y < this.size; }; Grid.prototype.clone = function() { newGrid = new Grid(this.size); newGrid.playerTurn = this.playerTurn; for (var x = 0; x < this.size; x++) { for (var y = 0; y < this.size; y++) { if (this.cells[x][y]) { newGrid.insertTile(this.cells[x][y].clone()); } } } return newGrid; }; // Set up the initial tiles to start the game with Grid.prototype.addStartTiles = function () { for (var i=0; i<this.startTiles; i++) { this.addRandomTile(); } }; // Adds a tile in a random position Grid.prototype.addRandomTile = function () { if (this.cellsAvailable()) { var value = Math.random() < 0.9 ? 2 : 4; //var value = Math.random() < 0.9 ? 256 : 512; var tile = new Tile(this.randomAvailableCell(), value); this.insertTile(tile); } }; // Save all tile positions and remove merger info Grid.prototype.prepareTiles = function () { this.eachCell(function (x, y, tile) { if (tile) { tile.mergedFrom = null; tile.savePosition(); } }); }; // Move a tile and its representation Grid.prototype.moveTile = function (tile, cell) { this.cells[tile.x][tile.y] = null; this.cells[cell.x][cell.y] = tile; tile.updatePosition(cell); }; Grid.prototype.vectors = { 0: { x: 0, y: -1 }, // up 1: { x: 1, y: 0 }, // right 2: { x: 0, y: 1 }, // down 3: { x: -1, y: 0 } // left } // Get the vector representing the chosen direction Grid.prototype.getVector = function (direction) { // Vectors representing tile movement return this.vectors[direction]; }; // Move tiles on the grid in the specified direction // returns true if move was successful Grid.prototype.move = function (direction) { // 0: up, 1: right, 2:down, 3: left var self = this; var cell, tile; var vector = this.getVector(direction); var traversals = this.buildTraversals(vector); var moved = false; var score = 0; var won = false; // Save the current tile positions and remove merger information this.prepareTiles(); // Traverse the grid in the right direction and move tiles traversals.x.forEach(function (x) { traversals.y.forEach(function (y) { cell = self.indexes[x][y]; tile = self.cellContent(cell); if (tile) { //if (debug) { //console.log('tile @', x, y); //} var positions = self.findFarthestPosition(cell, vector); var next = self.cellContent(positions.next); // Only one merger per row traversal? if (next && next.value === tile.value && !next.mergedFrom) { var merged = new Tile(positions.next, tile.value * 2); merged.mergedFrom = [tile, next]; self.insertTile(merged); self.removeTile(tile); // Converge the two tiles' positions tile.updatePosition(positions.next); // Update the score score += merged.value; // The mighty 2048 tile if (merged.value === 2048) { won = true; } } else { //if (debug) { //console.log(cell); //console.log(tile); //} self.moveTile(tile, positions.farthest); } if (!self.positionsEqual(cell, tile)) { self.playerTurn = false; //console.log('setting player turn to ', self.playerTurn); moved = true; // The tile moved from its original cell! } } }); }); //console.log('returning, playerturn is', self.playerTurn); //if (!moved) { //console.log('cell', cell); //console.log('tile', tile); //console.log('direction', direction); //console.log(this.toString()); //} return {moved: moved, score: score, won: won}; }; Grid.prototype.computerMove = function() { this.addRandomTile(); this.playerTurn = true; } // Build a list of positions to traverse in the right order Grid.prototype.buildTraversals = function (vector) { var traversals = { x: [], y: [] }; for (var pos = 0; pos < this.size; pos++) { traversals.x.push(pos); traversals.y.push(pos); } // Always traverse from the farthest cell in the chosen direction if (vector.x === 1) traversals.x = traversals.x.reverse(); if (vector.y === 1) traversals.y = traversals.y.reverse(); return traversals; }; Grid.prototype.findFarthestPosition = function (cell, vector) { var previous; // Progress towards the vector direction until an obstacle is found do { previous = cell; cell = { x: previous.x + vector.x, y: previous.y + vector.y }; } while (this.withinBounds(cell) && this.cellAvailable(cell)); return { farthest: previous, next: cell // Used to check if a merge is required }; }; Grid.prototype.movesAvailable = function () { return this.cellsAvailable() || this.tileMatchesAvailable(); }; // Check for available matches between tiles (more expensive check) // returns the number of matches Grid.prototype.tileMatchesAvailable = function () { var self = this; //var matches = 0; var tile; for (var x = 0; x < this.size; x++) { for (var y = 0; y < this.size; y++) { tile = this.cellContent({ x: x, y: y }); if (tile) { for (var direction = 0; direction < 4; direction++) { var vector = self.getVector(direction); var cell = { x: x + vector.x, y: y + vector.y }; var other = self.cellContent(cell); if (other && other.value === tile.value) { return true; //matches++; // These two tiles can be merged } } } } } //console.log(matches); return false; //matches; }; Grid.prototype.positionsEqual = function (first, second) { return first.x === second.x && first.y === second.y; }; Grid.prototype.toString = function() { string = ''; for (var i=0; i<4; i++) { for (var j=0; j<4; j++) { if (this.cells[j][i]) { string += this.cells[j][i].value + ' '; } else { string += '_ '; } } string += '\n'; } return string; } // counts the number of isolated groups. Grid.prototype.islands = function() { var self = this; var mark = function(x, y, value) { if (x >= 0 && x <= 3 && y >= 0 && y <= 3 && self.cells[x][y] && self.cells[x][y].value == value && !self.cells[x][y].marked ) { self.cells[x][y].marked = true; for (direction in [0,1,2,3]) { var vector = self.getVector(direction); mark(x + vector.x, y + vector.y, value); } } } var islands = 0; for (var x=0; x<4; x++) { for (var y=0; y<4; y++) { if (this.cells[x][y]) { this.cells[x][y].marked = false } } } for (var x=0; x<4; x++) { for (var y=0; y<4; y++) { if (this.cells[x][y] && !this.cells[x][y].marked) { islands++; mark(x, y , this.cells[x][y].value); } } } return islands; } // measures how smooth the grid is (as if the values of the pieces // were interpreted as elevations). Sums of the pairwise difference // between neighboring tiles (in log space, so it represents the // number of merges that need to happen before they can merge). // Note that the pieces can be distant Grid.prototype.smoothness = function() { var smoothness = 0; for (var x=0; x<4; x++) { for (var y=0; y<4; y++) { if ( this.cellOccupied( this.indexes[x][y] )) { var value = Math.log(this.cellContent( this.indexes[x][y] ).value) / Math.log(2); for (var direction=1; direction<=2; direction++) { var vector = this.getVector(direction); var targetCell = this.findFarthestPosition(this.indexes[x][y], vector).next; if (this.cellOccupied(targetCell)) { var target = this.cellContent(targetCell); var targetValue = Math.log(target.value) / Math.log(2); smoothness -= Math.abs(value - targetValue); } } } } } return smoothness; } Grid.prototype.monotonicity = function() { var self = this; var marked = []; var queued = []; var highestValue = 0; var highestCell = {x:0, y:0}; for (var x=0; x<4; x++) { marked.push([]); queued.push([]); for (var y=0; y<4; y++) { marked[x].push(false); queued[x].push(false); if (this.cells[x][y] && this.cells[x][y].value > highestValue) { highestValue = this.cells[x][y].value; highestCell.x = x; highestCell.y = y; } } } increases = 0; cellQueue = [highestCell]; queued[highestCell.x][highestCell.y] = true; markList = [highestCell]; markAfter = 1; // only mark after all queued moves are done, as if searching in parallel var markAndScore = function(cell) { markList.push(cell); var value; if (self.cellOccupied(cell)) { value = Math.log(self.cellContent(cell).value) / Math.log(2); } else { value = 0; } for (direction in [0,1,2,3]) { var vector = self.getVector(direction); var target = { x: cell.x + vector.x, y: cell.y+vector.y } if (self.withinBounds(target) && !marked[target.x][target.y]) { if ( self.cellOccupied(target) ) { targetValue = Math.log(self.cellContent(target).value ) / Math.log(2); if ( targetValue > value ) { //console.log(cell, value, target, targetValue); increases += targetValue - value; } } if (!queued[target.x][target.y]) { cellQueue.push(target); queued[target.x][target.y] = true; } } } if (markAfter == 0) { while (markList.length > 0) { var cel = markList.pop(); marked[cel.x][cel.y] = true; } markAfter = cellQueue.length; } } while (cellQueue.length > 0) { markAfter--; markAndScore(cellQueue.shift()) } return -increases; } // measures how monotonic the grid is. This means the values of the tiles are strictly increasing // or decreasing in both the left/right and up/down directions Grid.prototype.monotonicity2 = function() { // scores for all four directions var totals = [0, 0, 0, 0]; // up/down direction for (var x=0; x<4; x++) { var current = 0; var next = current+1; while ( next<4 ) { while ( next<4 && !this.cellOccupied( this.indexes[x][next] )) { next++; } if (next>=4) { next--; } var currentValue = this.cellOccupied({x:x, y:current}) ? Math.log(this.cellContent( this.indexes[x][current] ).value) / Math.log(2) : 0; var nextValue = this.cellOccupied({x:x, y:next}) ? Math.log(this.cellContent( this.indexes[x][next] ).value) / Math.log(2) : 0; if (currentValue > nextValue) { totals[0] += nextValue - currentValue; } else if (nextValue > currentValue) { totals[1] += currentValue - nextValue; } current = next; next++; } } // left/right direction for (var y=0; y<4; y++) { var current = 0; var next = current+1; while ( next<4 ) { while ( next<4 && !this.cellOccupied( this.indexes[next][y] )) { next++; } if (next>=4) { next--; } var currentValue = this.cellOccupied({x:current, y:y}) ? Math.log(this.cellContent( this.indexes[current][y] ).value) / Math.log(2) : 0; var nextValue = this.cellOccupied({x:next, y:y}) ? Math.log(this.cellContent( this.indexes[next][y] ).value) / Math.log(2) : 0; if (currentValue > nextValue) { totals[2] += nextValue - currentValue; } else if (nextValue > currentValue) { totals[3] += currentValue - nextValue; } current = next; next++; } } return Math.max(totals[0], totals[1]) + Math.max(totals[2], totals[3]); } Grid.prototype.maxValue = function() { var max = 0; for (var x=0; x<4; x++) { for (var y=0; y<4; y++) { if (this.cellOccupied(this.indexes[x][y])) { var value = this.cellContent(this.indexes[x][y]).value; if (value > max) { max = value; } } } } return Math.log(max) / Math.log(2); } // WIP. trying to favor top-heavy distributions (force consolidation of higher value tiles) /* Grid.prototype.valueSum = function() { var valueCount = []; for (var i=0; i<11; i++) { valueCount.push(0); } for (var x=0; x<4; x++) { for (var y=0; y<4; y++) { if (this.cellOccupied(this.indexes[x][y])) { valueCount[Math.log(this.cellContent(this.indexes[x][y]).value) / Math.log(2)]++; } } } var sum = 0; for (var i=1; i<11; i++) { sum += valueCount[i] * Math.pow(2, i) + i; } return sum; } */ // check for win Grid.prototype.isWin = function() { var self = this; for (var x=0; x<4; x++) { for (var y=0; y<4; y++) { if (self.cellOccupied(this.indexes[x][y])) { if (self.cellContent(this.indexes[x][y]).value == 2048) { return true; } } } } return false; } //Grid.prototype.zobristTable = {} //for //Grid.prototype.hash = function() { //}

2048.xlsm

The Excel application itself can be downloaded from Google .

The application functionality is described in a previous article, where AI plays using the Monte Carlo method . Today’s solution has been added to the existing Monte Carlo.

All articles of the AI and 2048 series

Monte Carlo
Minimax + alpha beta clipping
Waiting for maximum
Neural network

AI and 2048. Part 2: Minimax + alpha beta clipping

Monotone

Smoothness (smoothness, evenness)

Loose tiles

Small change

Minimax

Alpha beta clipping

Minimalax recursive function with alpha beta clipping

Static Evaluation Function

Monotone

Smoothness

Empty cells

Maximum tile

Siberian Federal District for 2048

2048.xlsm

All articles of the AI ​​and 2048 series

More articles:

All articles of the AI and 2048 series