Day 14: Extended Polymerization

Day #14 problem description │ Problem input

This problem asks us to perform a sequence of transformations to a starting string and then compute the counts of characters in the resulting string. Specifically, we are given a map from pairs of characters to single characters. Each generation, each pair of characters in the original string is to have its entry in this map (if it exists) inserted between the two characters. This runs on a specified starting string for a specified number of generations, producing a final string.

My initial attempt involved actually building the string at each stage. Unfortunately, because the string approximately doubles in length in each step, it’s computationally infeasible to generate the string the requested 40 generations out, nor would we be able to store it in memory (2⁴⁰ is around one trillion).

The right way to do this problem is to maintain a tally of counts of each pair (c1, c2) of characters in the string. Each generation, we run through the pairs, and if the mapping contains a character c3 to insert, then we remove key (c1, c2) and assign its tally to keys (c1, c3) and (c3, c2). This doesn’t tell us what the resulting string is, but thankfully the problem doesn’t ask us about the string per se; it only asks for character counts within the string.

To count the number of times a character c appears in the string, we simply note that it is counted twice, once for each pair it is the first character of and once for each pair it is the second character of. So we just add up the number of times we see it in a pair and then divide by 2, right? Wrong! The first (respectively, last) character of the string is actually only counted once because it is only the first (respectively, second) character of one pair of characters, not two. The way we model this is maintain a tally not just of character counts, but of the positions of those characters. Our tally now assigns counts to triples (c1, c2, pos0) where pos0 is one of “start”, “middle”, and “end”. When we insert c3 between c1 and c2, we obtain (c1, c3, pos1) and (c3, c2, pos2), where:

pos1 is either “start” or “middle”; it is “start” if and only if pos0 was “start”, and
pos2 is either “end” or “middle”; it is “end” if and only if pos0 was “end”.

Now, only the characters occurring in the “middle” are double-counted, so the number of times a character appears in the final string is the number of times it occurred in a character pair in the “start” or “end” positions, plus half the number of times it occurred in a “middle” character pair. To make the counting easier when working with integers and floored division, we double the “start” and “end” character counts, add them to the “middle” character counts, and then halve everything at the end.

As you might have guessed, my first attempt at this problem did not keep track of pairs’ positions and so was plagued by an off-by-one error which only arose for specific inputs. (I believe it had to do with whether the final string started and ended with the same character, but I didn’t bother generating test cases to check this.) My first submission was too large, so I subtracted 1 from it and re-submitted and got it correct. This was the first problem of this year’s Advent of Code that I thought it was feasible for someone to arrive at the right answer with buggy code and some guesswork.

Setup

use crate::Answer;
use std::collections::BTreeMap as Map;

#[derive(Copy, Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Position {
	Start,
	Middle,
	End,
}

#[derive(Debug)]
struct Polymer {
	template: String,
	mapping: Map<(char, char), char>,
}

impl Polymer {
	fn from_str(s: &str) -> Option<Self> {
		let mut lines = s.lines();
		let template = lines.next()?.to_owned();
		lines.next()?;

		let mut mapping = Map::new();
		for line in lines {
			let mut splat = line.split(" -> ");

			let mut outer_chars = splat.next()?.chars();
			let left = outer_chars.next()?;
			let right = outer_chars.next()?;

			let inner = splat.next()?.chars().next()?;
			mapping.insert((left, right), inner);
		}

		Some(Polymer { template, mapping })
	}

	fn get_initial_char_pair_counts(&self) -> Map<(char, char, Position), usize> {
		use Position::*;

		let mut ans = Map::new();

		// Iterate over adjacent pairs of chars. The last index iterated is `n_pairs - 1`
		// with `n_pairs == self.template.len() - 1`
		for (i, (c1, c2)) in self
			.template
			.chars()
			.zip(self.template.chars().skip(1))
			.enumerate()
		{
			let position = if i == 0 {
				Start
			} else if i == self.template.len() - 2 {
				End
			} else {
				Middle
			};

			*ans.entry((c1, c2, position)).or_default() += 1;
		}
		ans
	}

	fn apply_n_times(&self, n: usize) -> Map<(char, char, Position), usize> {
		use Position::*;

		let mut pair_counts = self.get_initial_char_pair_counts();

		for _ in 0..n {
			let pair_counts_vec = pair_counts
				.iter()
				.filter_map(|(&k, &v)| if v > 0 { Some((k, v)) } else { None })
				.collect::<Vec<_>>();

			for (key, count) in pair_counts_vec {
				let (c1, c2, position) = key;
				if let Some(&c) = self.mapping.get(&(c1, c2)) {
					let first_pos = match position {
						Start => Start,
						_ => Middle,
					};
					let second_pos = match position {
						End => End,
						_ => Middle,
					};

					*pair_counts.entry((c1, c, first_pos)).or_default() += count;
					*pair_counts.entry((c, c2, second_pos)).or_default() += count;

					*pair_counts.get_mut(&key).unwrap() -= count;
				}
			}
		}

		pair_counts
	}
}

fn get_ans(polymer: &Polymer, n: usize) -> usize {
	use Position::*;

	let char_pair_counts = polymer.apply_n_times(n);
	let char_counts = {
		let mut char_counts_2x = Map::new();
		for ((c1, c2, position), count) in char_pair_counts {
			let c1_multiplier = match position {
				Start => 2,
				_ => 1,
			};
			let c2_multiplier = match position {
				End => 2,
				_ => 1,
			};

			for (c, mult) in [(c1, c1_multiplier), (c2, c2_multiplier)] {
				*char_counts_2x.entry(c).or_insert(0) += mult * count;
			}
		}

		char_counts_2x
			.into_iter()
			.map(|(k, v)| (k, v / 2))
			.collect::<Map<_, _>>()
	};

	let mut max_count = 0;
	let mut min_count = usize::MAX;
	for &count in char_counts.values() {
		if count < min_count {
			min_count = count;
		}
		if count > max_count {
			max_count = count;
		}
	}

	max_count - min_count
}

fn ans_for_input(input: &str) -> Answer<usize, usize> {
	let polymer = Polymer::from_str(input).unwrap();
	(14, (pt1(&polymer), pt2(&polymer))).into()
}

pub fn ans() -> Answer<usize, usize> {
	ans_for_input(include_str!("input.txt"))
}

Parts 1 and 2

fn pt1(polymer: &Polymer) -> usize {
	get_ans(polymer, 10)
}

fn pt2(polymer: &Polymer) -> usize {
	get_ans(polymer, 40)
}