Ok, I'm not sure anyone's realised this, but a lot of the discussion on the last page has been had before, on the first two pages of this thread. The idea being discussed is slightly (and only slightly) different, but the problems are the same. I've watched you circle some of the solutions, but also create more problems, so I'll try to fix some of them. As a bonus, I'll even try to include that pseudo code hypo's so desperate for.
Systems
We're discussing multiple different systems here. They all interlink, but I think there's been some confusion about which ones we're discussing and how they all relate:
Population dynamics - controls the number of individuals of each species in each location. That's pretty much it, but it ties into pretty much every other system below, making it seem far more complicated than that. It monitors/processes birth/death rates, transfer of compounds between species (predation), migration between regions etc.
Climate - controls temperature and light levels (among others) across the planet, which is fed into the population dynamics. Light levels affect the available energy, temperature may affect metabolism, but also survival chances.
Auto-evo - what we really seem to be discussing here, despite it not being an auto-evo thread (even if it is very closely linked). Auto-evo selects which species get to mutate, and decides which mutations to apply. That's it, details later.
Compound system - controls what compounds have what uses, which can be converted to each other, etc. It's also the target of most mutations, either by making a process more efficient (by making an organelle better), or enabling a new one, or developing a new compound (toxin). Conceptually the connection between the compound system and pop-dyn/auto-evo is the simplest; technically it's probably the most complicated. I'll see if I have time to get into it here or not.
Stages of auto-evo
This has been mentioned here, but I'll mention it again to make the following clearer:
1 - The player plays for a while, auto-evo is basically idle, though we may pre- or post-process stuff from other generations
2 - A generation ends, either because the player reproduces and opens the editor window, or a certain amount of time passes, and we do an auto-evo step in the background
3 - Auto-evo is passed a list of all species populations and their current states
4 - It selects which of those species get to mutate. As discussed here, this could be one species, many species, or even all or none of them. The probability of each species getting to mutate is separate, and dependent on multiple factors including their population size, selection pressure (i.e.: how unfit they currently are), reproduction rate etc. If the mutation of another species had a particular impact on a species in the last round (someone gave an example of a predator evolving and its prey not getting a chance), the affected species gets a much improved chance this round. As the discussion here centers around how to select mutations, I'm going to ignore this step and assume that every species gets to mutate every generation for this post.
5 - We take the information about how each species performed last round and decide which mutation is best for it now. In some circumstances it may get to pick multiple mutations, but again for simplicity when explaining I'll assume it only gets one. This is what we're discussing here, so details later.
6 - All the mutations are applied. By doing this after step 5, the order in which species mutate this step has no effect on what they decide to do.
7 - We generate new meshes and animations as needed, and load back into the game.
Apart from the generation of new meshes and animations, all of this is computationally very cheap, and can be done in the background, with biomes the player isn't currently observing being processed while they're playing.
A note regarding 4 - some of the factors for picking who gets to mutate may now be used (in the microbe concept we're discussing in the other thread) to decide how many MP each species gets to work with, so we need to double check this and make sure we're not using the same factors.
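To make the ordering of steps 3 to 6 concrete, here is a rough Python sketch. Every name in it (Species, select_mutators, choose_mutation, auto_evo_step) is a hypothetical placeholder, and the two helper functions are stubs rather than the real selection logic. The only point it illustrates is that all mutations are decided against the same snapshot before any of them is applied:

```python
import random
from dataclasses import dataclass, field


@dataclass
class Species:
    name: str
    population: float
    genome: dict = field(default_factory=dict)  # trait name -> value (assumed layout)


def select_mutators(species_list, rng):
    # Step 4 (stub): every species mutates, as assumed for this post.
    return list(species_list)


def choose_mutation(species, all_species, rng):
    # Step 5 (stub): only reads state, returns (trait, delta) or None.
    if not species.genome:
        return None
    trait = rng.choice(sorted(species.genome))
    return trait, rng.choice([-0.1, 0.1])


def auto_evo_step(species_list, rng):
    # Step 5: decide every mutation against the same, unmodified snapshot.
    chosen = []
    for sp in select_mutators(species_list, rng):
        mutation = choose_mutation(sp, species_list, rng)
        if mutation is not None:
            chosen.append((sp, mutation))
    # Step 6: apply them all afterwards, so processing order cannot matter.
    for sp, (trait, delta) in chosen:
        sp.genome[trait] += delta
    return chosen
```

Step 7 (meshes and animations) is left out, since it has nothing to do with the selection logic.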
Selecting a mutation
Dani's post was a pretty good explanation of the current system, except in one detail - how we calculate fitness. This has been our problem from the start, and we've made several attempts at solving it, and come pretty close. The solution (I hope - there's no way of knowing for sure until we actually test it) is what was discussed on the first page of this thread, and has pretty much just been discussed again on the last page or two, with a few alterations.
I'm going to go over a few of the suggestions since my last post yesterday, some are pretty good, but others have flaws which haven't been pointed out yet.
I'll start with Calli's post last night and work from there. I'm going to ignore whatever questions were left unanswered up to that point, because whatever answers hypo was after, you didn't do a very good job of asking the right questions - probably why it keeps looking like we're not reading half your posts, because we're simply not seeing whatever question or point you think was implied.
- hypoxanthine wrote:
- 1) creature chosen for mutation by whatever method you like
2) random genome parameter selected to mutate
3) random increase or decrease of this parameter
4) depending on the genome parameter modified, increase or decrease one of the population parameters for the species (say were using Seregon's split-parameter version of Lotka-Volterra, if the speed of travel is increased by lengthening of legs, which youd test by doing just one quick simulation to measure running speed, then youd increase one of the parameters that make up the overall efficiency of predator-prey-encounter conversion to predator-individuals, like the probability of hunt success)
5) run population dynamics
6) land player on planet
7) probability of creature appearance based on number from population dynamics (in some way i havnt decided on, also based on area of biome and another pop. parameter of distribution)
Here we're just applying a completely random mutation (3-4) and seeing what happens. Yes, we update the pop-dyn (4) so that it will have some effect on population levels, but there's no actual selection going on. That is, unless you expect pop-dyn itself to take care of that selection, which it isn't really designed to. Yes, poor mutations will lead to a loss of population, but that loss of population doesn't really feed back into what mutation is chosen, so you'll just get loads of very random species, most of which will die out.
We'll eventually get a good species by blind luck, but we're missing most of the aspects that make genetic algorithms effective - species don't mate with each other, they don't reproduce asexually to produce new species (except when sub-populations speciate, but that doesn't help here). Effectively this is a GA with no interaction between agents, also known as blind trial & error.
- hypoxanthine wrote:
- 1) select rand species to mutate
2) select rand popdyn parameter and increase by small rand amount
3) each popdyn parameter change has a genomic interpretation. if you split the parameters up more, youll eventually get things like 'speed, stealth, etc.' so if you'd just increased the speed popdyn parameter for example, this would change the leg randomly, then run a simulation to determine if speed had increased and then hill-climb until it actually has. stealth? no simulation needed here probably - hill-climb to a set colour for the particular biome (simplistic i know). you can imagine what else there would be.
This is close, but backwards and a little poorly defined. What you're saying is that we first pick a random pop-dyn parameter to mutate, then pick a random associated genome/phenome trait to mutate to achieve that mutation, and then test to see whether that genome mutation actually achieved the desired pop-dyn mutation... that seems like a hell of a lot of work compared to the exact opposite:
1 - select species
2 - select a genome trait to mutate
3 - test what effect this has on the pop-dyn parameters, and what effect this has (I'll get to this later in the post) on species fitness
4 - if it's positive, accept the mutation.
Ideally, I'd extend this to trying every possible genome mutation (or at least a large subset), and picking the most favorable one. That is your hill-climbing algorithm: pick the best possible mutation at every step. How feasible this is depends on the size of the genome and the computational expense of testing fitness, neither of which we're sure of right now, but I'm fairly confident that this is at least feasible.
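As a minimal Python sketch of the four-step version (one random trait, accept only if fitness improves): the fitness function here is a made-up stand-in with a single peak, and the trait names and step size are assumptions for illustration only.

```python
import random


def fitness(genome):
    # Hypothetical stand-in: peaks when speed is 2.0 and size is 1.0.
    return -((genome["speed"] - 2.0) ** 2) - (genome["size"] - 1.0) ** 2


def try_one_mutation(genome, rng, step=0.1):
    trait = rng.choice(sorted(genome))             # 2 - select a genome trait
    candidate = dict(genome)
    candidate[trait] += rng.choice([-step, step])  # mutate it up or down
    if fitness(candidate) > fitness(genome):       # 3 - test the effect on fitness
        return candidate                           # 4 - accept if positive
    return genome                                  # otherwise keep the old genome
```

Calling try_one_mutation repeatedly never lowers fitness, which is exactly what makes it a (very naive) hill climber.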
- hypoxanthine wrote:
- imagine a predator mutates and is now much better at hunting. unless the great random mutates the prey in time, theres a significant danger of extinction
I mentioned this in step 4 of the auto-evo process above. When species get mutated is an entirely separate issue to how they get mutated, so we'd have this problem regardless of the system. Except that we have a solution, discussed in the thread linked in 4 above.
- hypoxanthine wrote:
- a good fix for this might be to weight the probabilities of mutation for a species by its current population count (smaller population => greater selection pressures => mutants are selected for more strongly on an agent level => mutant allele frequency increases much more quickly on population level => so on population level, increase the mutation rate
Pretty good, and similar to what's discussed in the linked thread in 4.
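One possible shape for such a weighting, as a Python sketch. The formula, the function name mutation_chance and all the constants are assumptions made up for illustration; the linked thread may well settle on something different. It only shows the three factors pulling in the direction described: small populations, negative growth and having been hit by another species' mutation last round all raise the chance.

```python
def mutation_chance(population, growth_rate, hit_last_round,
                    base=0.1, reference_population=1000.0, hit_boost=3.0):
    # Smaller populations get a factor closer to 2, large ones closer to 1.
    small_pop = 1.0 + reference_population / (reference_population + population)
    # Selection pressure: only a shrinking population (negative r) adds weight.
    pressure = 1.0 + max(0.0, -growth_rate)
    chance = base * small_pop * pressure
    if hit_last_round:
        # Species affected by someone else's mutation get a much better chance.
        chance *= hit_boost
    return min(1.0, chance)  # keep it a valid probability
```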
- hypoxanthine wrote:
- how is this preference decided for each predator
Prey preference should be an evolvable trait. I think Scio mentioned that we will have a food web for each biome, so we know who is capable of predating on who. All we need to know then is to what degree each predation actually takes place. This is partially preference, partially both predator's and prey's abilities to detect and avoid/find/chase/kill each other, and partially availability (the higher the population densities of both predator and prey, the higher the rate of 'collisions' between them, and the higher the interaction rate, regardless of whatever else either species does to modify that rate). The degree to which each predation actually occurs, and how successful it is, affects the fitness of both parties, and that degree therefore also decides how much of an impact that particular interaction has on each party's fitness, and how important mutations affecting that interaction are.
- ~sciocont wrote:
- We're not idiots.
QFT
- ~sciocont wrote:
- In this way, instead of changing the form of an organism and figuring out the downstream effects, we change the properties of the organism and then mutate the form to reflect that. It gives us a lot more control over the form of the organism and thus an extra layer of insulation against the "ugly creatures" problem.
As nice as this sounds, doing it simply isn't trivial. If we're incapable of figuring out the downstream effects of changing an organism (as suggested here and elsewhere), how do we go about changing those downstream properties, then going back and finding modifications which have the desired effect? The latter is only possible if the former is, and the former is a lot less contorted.
- ~sciocont wrote:
- One interpretation of your popdyn parameter change is to simply boost the population (in the next generation) of the individual selected for mutation by some multiplier (5% or something, we can tinker with it based on the orgs population as a fraction of the community (biome) population) and then find a parameter change that might increase population by that amount. One problem I see with this is that, unless we have many different popdyn parameters for each species, mutations will be more or less arbitrary. Of course, if we're modeling every meaningful two-species interaction, then we need only add the multiplier to one of these parameters and then we have a more specific needed mutation.
I'm interpreting this as "boost a species' fitness by an arbitrary value, and then find a mutation which would justify this boost", and it has the same issues as mentioned above. The problem of not having many pop-dyn parameters really is a problem, and we can't expect to have meaningful or interesting auto-evo with the very few parameters used earlier on in this thread. I mentioned as much then, but only said that we'd need more than 7. We will need a lot more than that, representing things as specific as organism speed, mass, metabolic rate etc. I'll get back to this idea later.
- hypoxanthine wrote:
- 1. select a random organism to mutate
2. randomly mutate two organs (or other) in said organism (these can either harm or benefit organism, it does not matter, the only required part to this is at least one be beneficial.
3. run simulator with both mutations added multiple times.
4. take results and put successes out of a whole of the simulation. (delete if success is 0 out of total and restart, unless of course you wish to keep it for an extinction)
5.depending on how they do in said simulation will determine the overall population change.
6. back in the environment the results will be reflected by how the population changes.
Apart from mutating two traits/organs at a time (why?), I really don't see how this is at all different to what Dani explained on the previous page - you do a random mutation, test if it's beneficial, accept or reject it. The only real addition is that the performance in the fitness tests now translates to a population boost (is this a boost to population, growth rate, or what? you don't specify). Simple as that may be, the net effect is that our population simulations no longer obey the conservation of energy. If we give species arbitrary boosts to growth rates, we're giving them energy which they're not obtaining from a source (predation/consumption of another species or photo/chemosynthesis). We could attempt to balance this arbitrary change somehow, but that gets complicated.
- Daniferrito wrote:
- Actually, i dont think selecting a random parameted to addjust would suffice. Modifying any part of an organism has usually more than one parameter affected, which usually is the energy it takes to create a new creature (as it either became more or less complex or big) and another parameter depending on the specific part.
Good point - another reason we can't choose one pop-dyn parameter to mutate, then look for a genome mutation to achieve that, there would be too many side effects, some of which may make an apparently positive mutation negative overall.
- hypoxanthine wrote:
- if your worried about then why not have associated populome (bugger this im coining this term now) changes for each genome change
Getting warmer...
- hypoxanthine wrote:
- instead of having just one genomic interpretation of a populome change, have more than one associated genome change.
e.g. populome change: speed increased => out of possible genome changes for this event: longer legs selected (then sim to ensure this actually does increase speed). => genome change: more energy required for movement etc.
...but you're still working backwards.
- ~sciocont wrote:
- This should mean that the data for the organism's abilities is stored separately from the data used for its physical model and AI behavior
Do you mean abilities as in speed, strength etc., which are effectively the result of its physical genome (leg length, muscle mass etc.)? If so, yes, there should be three layers of data:
1 - the genome, which is mutated by auto-evo. We have no way of assessing how fit a genome is without looking further.
2 - the abilities, or phenome (phenome in a different sense to how I was using it yesterday), which are the result of the genome. These are the result of either mini-simulations, or calculations.
3 - the pop-dyn parameters, which are influenced by a species' abilities. It's only from these, and their effect on r, that we can actually calculate fitness.
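The three layers chain together as fitness = r(pop-dyn(phenome(genome))). A tiny Python sketch of that chain follows; every formula and constant in it (speed from leg length times muscle mass, hunt success as a speed ratio, the birth and death numbers) is invented purely to show the data flow, not a proposal for the real equations.

```python
def phenome_from_genome(genome):
    # Layer 2: abilities calculated (or mini-simulated) from the genome.
    return {"speed": 2.0 * genome["leg_length"] * genome["muscle_mass"]}


def popdyn_from_phenome(phenome, prey_speed):
    # Layer 3: hunt success rises as the predator outpaces its prey.
    speed = phenome["speed"]
    return {"hunt_success": speed / (speed + prey_speed)}


def growth_rate(params, births_per_success=0.5, death_rate=0.2):
    # r = b - d, with b driven by hunting success in this toy version.
    return births_per_success * params["hunt_success"] - death_rate


def fitness(genome, prey_speed=1.0):
    phenome = phenome_from_genome(genome)
    return growth_rate(popdyn_from_phenome(phenome, prey_speed))
```

Note that auto-evo only ever edits the first layer; the other two are recomputed from it.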
- hypoxanthine wrote:
- so, as weve discussed, its effectively a hill-climbing algorithm. that immediately gives us a problem because hill-climbers are greedy, so they tend to get stuck
Yes, hill climbers get stuck, that's the advantage of GAs/simulated annealing/swarm sims etc. Simulated annealing may help, but all that really does is allow negative mutations at a small random chance. We already do this by having genetic drift - a small random change in a random trait. Also, I think you're suggesting running this at every generation, effectively attempting multiple mutations until we either find a good one or give up and accept a bad one. If we allow this to go on too long, it would be simpler and a lot more efficient to simply try all mutations and pick the best one, or with small random chance pick a bad one.
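That last option ("pick the best one, or with small random chance pick a bad one") can be sketched in a few lines of Python. Strictly speaking this is closer to an epsilon-greedy choice than to proper simulated annealing, since there's no cooling schedule. The function name, the bad_chance value and the mutation labels in the usage note are all hypothetical.

```python
import random


def pick_mutation(scored, rng, bad_chance=0.05):
    """scored is a list of (mutation, predicted_growth) pairs, already tested."""
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    if len(ranked) > 1 and rng.random() < bad_chance:
        # Small random chance: take any option other than the best one.
        return rng.choice(ranked[1:])[0]
    # Normal case: plain hill climbing, take the best predicted growth.
    return ranked[0][0]
```

For example, pick_mutation([("longer_flagellum", 0.04), ("no_change", 0.0)], random.Random()) returns "longer_flagellum" about 95% of the time with the default setting.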
Also, I really don't see the problem if it gets stuck. If the system finds a very effective genome (e.g.: a shark), why shouldn't that species survive unchanged for a very long time? The problem is if every species does that, and we end up with a static ecosystem, but given that this would require the environment to be static too, it shouldn't really happen. With a seasonal climate, long term climate change, catastrophes, and whatever changes the player makes, there should always be something for the AI species to adapt to, so they shouldn't become stationary. The simulations on the first page show that even in a trivially simple system, adding seasonality can delay stabilisation from a few 'months' to hundreds of 'years'.
Finally, if a species gets stuck in a local maximum, it may be unable to adapt to long term changes in its environment. If it doesn't get out of that maximum and find a better one before it's too late, that maximum may get to a point where it can't sustain the population, and it goes extinct. Tbh that's just realistic, species usually go extinct because they're unable to adapt.
Pieces of the puzzle
Right, now I'm hoping to show how we've already solved most of these issues before. A lot of this comes from the various auto-evo threads, and a lot is relatively new developments from various discussions I've had with Scio and Calli (both on this thread, elsewhere on the forum and off it) and a lot of thinking over the past few months. I really wasn't ready to present this yet, partially because there's a lot of more basic things I need to introduce first (which I will briefly do here), and partially because there are still some issues I haven't had time to fix (which I'll point out, so we can hopefully work on solving them). First, back to the separate pieces, from the bottom up:
Dynamic systems
Each organism is a collection of compounds, which are managed by the compound system. Organisms gain compounds from their environment and from other organisms, they process them internally using organelles or other processes, and they excrete waste compounds back into the environment. Some compounds are necessary for survival, others are harmful. An organism's basic aim in life is to have the right mix of compounds available to allow it to survive and reproduce.
At the population level, species can be represented as the number of individuals in an area, and the average compound contents of its individual organisms. The compound system again manages these compounds, but the population is managed by population dynamics, which takes information from the compound system. If a population has insufficient access to a particular compound (e.g.: sugar), all its members will suffer and the population as a whole will suffer starvation, reducing birth rates and possibly increasing mortality.
In addition to information from the compound system (the internal processes affecting a species' members), population dynamics also takes information from external processes, including predation, other interactions (e.g.: symbioses, competition) and environmental drivers (e.g.: light and temperature levels). All of these factors have some effect on a species' ability to survive and reproduce, but it's far from sufficient to simply talk about birth and death rates. We need to differentiate (in the non-mathematical sense) these parameters into more specific parts, e.g.:
- Species A mortality due to predation by species B
- Species B energy/compound gain due to predation of species A
- Species C (a plant) energy gain due to photosynthesis
- Species A mortality due to exposure to environmental hazard (toxin, extreme temperatures etc.)
- Energy expended by species B when searching for prey
- Energy expended by species A/B during a predation encounter (the chase/fight)
- Energy expenditure required for species A to reproduce
This list is far from complete, and deciding all the necessary parameters and related equations won't be easy. This list, along with a similar list of 'abilities' I'll mention later, are the two key issues we need to overcome to make this work. They're certainly not impossible though, just hard work.
Now we have something which is far more than a simple logistic equation or predator-prey model. It includes those two things, but also many more factors, in calculating population levels. This appears overcomplicated for this purpose, but that's not the point. Knowing the population level of each species in each location is a side effect of the population dynamics system; what we actually want is a way of calculating species fitness. The simplest measure of fitness is r, or (b-d) as explained on the first page of this thread. It's pretty clear that it won't be anywhere near that simple with the parameter list above, but calculating it will be just as simple for the computer.
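To show why the longer parameter list doesn't make r any harder to compute, here is a hedged Python sketch for a predator like species B. The way b is tied to the energy surplus, and every number in the usage note, are assumptions for illustration; only r = b - d and the idea of splitting gains, costs and mortality by cause come from the discussion above.

```python
def birth_rate(energy_gains, energy_costs, energy_per_offspring, max_birth_rate):
    # b: births are paid for out of whatever energy is left after the costs.
    surplus = sum(energy_gains.values()) - sum(energy_costs.values())
    if surplus <= 0:
        return 0.0
    return min(max_birth_rate, surplus / energy_per_offspring)


def growth_rate(b, mortality_by_cause):
    # r = b - d, where d is just the sum of the separate mortality terms.
    return b - sum(mortality_by_cause.values())
```

For example, with gains {"predation_of_A": 10.0}, costs {"searching": 2.0, "chase": 3.0}, 20 energy units per offspring and a hazard mortality of 0.05, this gives b = 0.25 and r of roughly 0.20. Adding another interaction is just another dictionary entry.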
Data
Now the other half of the puzzle. Each species has a genome, and all members of a species in a particular location/biome (a population) share the same genome. Populations of the same species in separate locations may differ slightly, and may eventually speciate if they're separated for too long, but speciation and spatial dynamics (migration) are beyond the scope of this discussion. This genome controls the placement and number of organelles and limbs, the efficiency and size of those organelles, the efficiency of processes not associated with any organelle, and the compounds available to the organism.
Derived from this information, and the definitions of each genome, compound, process and what each can do, is the organism/species' phenome. This phenome records the abilities of the organism. For example, if the genome specifies 3 mitochondria with an efficiency of x%, and the mitochondria definition says that it can produce 1mol ATP per second at 100% efficiency, then the phenome records the ability to produce ATP from sugar at a maximum rate of 3*(x/100) moles of ATP per second (depending on the availability of oxygen + sugar). If the organism has the appropriate organelle (golgi apparatus + vesicles?), and the ability to produce a particular toxin, then the phenome records the ability to produce that toxin at a rate dependent on the efficiency of the golgi, and at a strength dependent on the level of the toxin. The phenome will also have a value for the organism's maximum speed, dependent on the types of motion available (flagella, cilia, pseudopodal), their number and efficiency, as well as the size and mass of the cell. Again these are examples from an incomplete list.
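The mitochondria example works out like this in Python. The function name and parameter names are hypothetical, and the result is only the maximum rate; limiting it by oxygen and sugar availability would happen elsewhere.

```python
def max_atp_rate(mitochondria, efficiency_percent, mol_per_second_at_full=1.0):
    # Each organelle contributes its full-efficiency output scaled by x%.
    return mitochondria * (efficiency_percent / 100.0) * mol_per_second_at_full
```

So 3 mitochondria at 80% efficiency give a maximum of about 2.4 mol of ATP per second.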
We also have a budget of mutation points, or genetic diversity, to spend on mutations. How exactly these will work is still being discussed in another thread, so I'll just assume they're limited, and that they may be split into separate pools for different types of mutation.
Finally we have data both from the environment and the Population Dynamics system: the number of each population, the density (dependent on the area of each location), acidity, temperature, light, environmental compound reserves etc.
Calculating population
Now we have all the pieces we need to calculate population. The population dynamics equations combine the environmental, population and compound level data (variables) with each species' phenome (parameters) to calculate a species' reproduction and mortality rates, and therefore the change in its population for the next generation. An example:
From populations, we know the density of species A (predator) and B (prey), and therefore how often they are likely to meet each other. From species A's phenome we know how likely it is to detect B upon meeting it, as well as how capable it will be of chasing B down. Combining this with B's phenome telling us how well it can escape or fight A off, and some other details from each phenome, we can calculate the chance of a meeting resulting in an individual of B being successfully brought down. We also calculate how much energy this hunting process (successful or not) costs each species; each species loses this much energy regardless. We then multiply the encounter chance by the chance of a successful kill to find an overall predation rate. Population B loses members, and their embodied energy and compounds, at that predation rate. Population A gains some of that energy and those compounds at a rate depending on their digestion efficiency and their ability to defend the kill and consume it before being chased off. What doesn't get eaten by A is returned to the environment as a carcass, which may be decomposed or scavenged, and whatever material isn't digested by A is returned as faeces. This is only one interaction between two species, there will be many more between both species and the environment.
All of this information is fed back into the next generation as changes to species populations, their compound reserves, and the environment - some of the data in the previous section.
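The predator-prey example can be sketched in Python as below. The encounter model (a coefficient times the product of densities), the per-encounter costs and all parameter names are assumptions; the real equations are still to be decided. What the sketch does show is that the energy of every kill is fully accounted for between the predator, faeces and the carcass, so the conservation of energy holds.

```python
def predation_step(density_a, density_b, p_detect, p_catch,
                   energy_per_prey, digestion_eff, fraction_eaten,
                   hunt_cost_a, flee_cost_b, encounter_coeff=1.0):
    # Encounters scale with both densities ('collisions' between A and B).
    encounters = encounter_coeff * density_a * density_b
    # Predation rate = encounter rate * chance an encounter ends in a kill.
    kills = encounters * p_detect * p_catch
    killed_energy = kills * energy_per_prey
    eaten = killed_energy * fraction_eaten  # what A consumes before being chased off
    gained = eaten * digestion_eff          # what A actually digests
    return {
        "kills": kills,                      # members lost by B
        "energy_gained_a": gained,
        "faeces": eaten - gained,            # eaten but not digested
        "carcass": killed_energy - eaten,    # left for decomposers/scavengers
        "energy_spent_a": encounters * hunt_cost_a,  # paid whether or not it succeeds
        "energy_spent_b": encounters * flee_cost_b,
    }
```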
Auto-evo
Finally we come to the question of mutations. Everything else about auto-evo is pretty irrelevant at this point - we've picked a species, possibly because of a mutation to another species, possibly at random; we may or may not also apply a completely random 'genetic drift' mutation - all we're worried about here is picking the best possible mutation given all the information we have.
- What we do is pick a set of traits in the genome, this may be small or may include all traits depending on how computationally intensive this all turns out to be.
- For each one we then increase and/or decrease it as far as our mutation budget allows (for some upgrades we simply can't afford, we stop here and move on to another trait). We also consider one genome where no mutation takes place.
- We then calculate the resulting phenome for each such possible change.
- We calculate the predicted population growth rate in the next generation for each phenome.
- Whichever change results in the greatest population growth provides the best fitness boost, so we choose that mutation. If all growth rates are negative, we choose the least negative. These population growth 'estimates' are calculated exactly as in the previous section, and include input from interactions with other species and the environment. Phenomes for other species are kept as they were set at the last generation, so all species have the same information about each other.
Steps 3 and 4 are the difficult bit. Wherever possible we need to find equations for these calculations, and I'm fairly confident that we can do this for most if not all situations. Where we can't, we resort to the mini-simulations discussed in the previous few posts, but that should be a last resort.
If you're wondering, steps 2 and 5 are the hill climbing parts; in 2 you choose a few possible directions, then in 5 you find out which of those directions is uphill. I have just realised that this is a very naive hill climbing system, and we might be able to reduce the number of mutations we need to test drastically by choosing a better one (Nelder-Mead comes to mind). However, I need to think this over some more, and even if we can't I'm really not that worried.
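The five steps above, in their naive form, might look like this in Python. The genome layout (integer trait counts), the per-step costs and the predicted_growth callable are all hypothetical; predicted_growth stands in for steps 3 and 4 (genome to phenome to growth rate, with other species' phenomes frozen at their last-generation values).

```python
def candidates(genome, cost_per_step, budget):
    # Always include the option of not mutating at all.
    options = [("none", dict(genome))]
    for trait, value in genome.items():
        steps = int(budget // cost_per_step[trait])
        if steps == 0:
            continue  # can't afford this trait, move on to the next one
        for sign in (1, -1):
            new_value = value + sign * steps  # as far as the budget allows
            if new_value < 0:
                continue  # assumed: trait counts can't go negative
            mutated = dict(genome)
            mutated[trait] = new_value
            options.append((f"{trait}{sign * steps:+d}", mutated))
    return options


def best_mutation(genome, cost_per_step, budget, predicted_growth):
    # Greatest predicted growth wins; if all are negative, max() still
    # returns the least negative one.
    return max(candidates(genome, cost_per_step, budget),
               key=lambda option: predicted_growth(option[1]))
```

With a genome of {"flagella": 1, "mitochondria": 2}, costs of 10 and 25 per step and a budget of 20, only the flagella can be changed, so the choice is between "none" and "flagella+2".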
At this point all that's left is to make the chosen changes to the genome and update the phenome. Once all mutations for this generation are done we calculate population growth rates for each species via standard population dynamics and update all the population levels, along with compound levels in species and those in the environment. The last step is to create any new meshes and animations needed, then go back to gameplay.
Summary
I'm not sure if there's much more to say. I've probably made a few mistakes here, it's gone 5am and I've been writing this since before midnight, so I'll try and clear up anything obvious tomorrow. I also mentioned at the start that I was far from ready to present this, so not everything is as well developed as I'd like.
I should also point out that while there are bits of this which would be difficult to program, there's nothing here which I couldn't program (and I'm definitely not the best programmer here), and an awful lot of it is similar to the sort of stuff I do every day, so it's certainly possible. I can provide much more detailed program flow and pseudo code when needed, but that really isn't a priority right now.
Again, a lot of what's been discussed here the past few days is similar to this. Also, of all the criticism the various ideas have received, the only truly valid one I can see is that this is really rather complicated. To that I can only say that I can't think of a simpler system which will produce similar or better results for a reasonable amount of computation (and that's not for lack of trying). Also, while the whole system is definitely complex, the individual components are actually very simple, and even the ways they interact are fairly simple too.
A few points on computation
I run these sorts of simulations and computations almost daily, and I know we can run probably hundreds of thousands of them per second. Given the few complicating factors (these equations are a fair bit more complicated, they'll be running in C++ which is a lot faster than what I use, we need to run a game at the same time) I'm pretty confident that computation isn't going to be an issue, so long as we don't resort to too many mini-sims.
In addition to this, we won't be running this in anywhere near realtime. Auto-evo only needs to run on each location maybe once every 2 minutes, if that? Locations not observable by the player can be run even less frequently, and be done in the background while they're playing, so that only the observable location needs to be run while the player is in the editor.
The biggest computational workload to auto-evo will still be the generation of any new meshes or animations, and this only really needs to be done for the observable location. Other locations can have these generated as and when needed as the player moves around.
Finally, this wasn't supposed to be a rant, sorry if some of it came across that way. What it's supposed to be is a summary of most of the relevant concepts we've discussed on various parts of this forum about auto evo and population dynamics. A lot of this was discussed and practically decided in the first 2 pages of this thread, I've just filled in the blanks from what I've been thinking about over the past several months.
tl;dr - I wasn't BSing.