You may have noticed that the server has recently had a steep decline in stability.
We take server stability very seriously, and it's a real shame that we aren't able to fulfill our goals for stability.
We know these crashes are extremely frustrating for our players, especially those who lose items during the crashes. Trust me, they are extremely frustrating for us too. It seems like every time I look away for any extended period of time, something has gone wrong.
Today, I am going to write about each individual crash, why it happened, and how the team is planning to solve these problems in the future.
Server crash explanations
In the past 2 weeks, the server has crashed 5 times. That is 5 crashes in 10 working days, so on any given working day there is roughly a 50% chance that the server will crash.
A 50% crash rate is certainly not something we are proud of.
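The arithmetic behind that figure can be sketched as follows. This is a minimal sketch that assumes a two-week window contains 10 working days (5 per week). The function name is made up for illustration, and the result is an average number of crashes per working day, which is only a rough stand-in for a true probability.

```python
def crashes_per_working_day(crashes: int, weeks: int, working_days_per_week: int = 5) -> float:
    """Average number of crashes per working day over the given window."""
    working_days = weeks * working_days_per_week  # assumption: 5 working days per week
    return crashes / working_days


rate = crashes_per_working_day(crashes=5, weeks=2)
print(f"Average crashes per working day: {rate:.2f}")  # 5 / 10 = 0.50
```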
Here is a list of the last 5 crashes, and their causes.
: Unexpected server maintenance (to patch Spectre and Meltdown). Our server host did not notify us in advance about this.
: Database server stalled due to resource exhaustion.
: Login server stalled due to resource exhaustion.
: Database server stalled due to resource exhaustion.
: Unexpected Windows OS level error, probably caused by resource exhaustion but not verified.
Pet lag (or rather, the sinister database stall)
You may have experienced database server stalling firsthand - it's what is known as "pet lag". This "pet lag" is much more sinister than it may look at first glance.
People call it pet lag because summoning a pet takes an extremely long time to finish. The cause is the database server failing to respond to queries in a timely manner.
This lag also affects logging in, saving bank info, changing channels, and basically anything that touches the database.
The game server has a very aggressive cache on player data, so you usually don't notice it while playing until you summon a pet.
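To picture why a stalled database stays invisible until a pet is summoned, here is a minimal sketch of a read-through cache. It is not the Mabinogi server's actual code: the class names, the `fetch` and `get` methods, and the simulated delay are all assumptions made for illustration.

```python
import time


class SlowDatabase:
    """Stand-in for a stalling database server (hypothetical)."""

    def __init__(self, delay_seconds: float) -> None:
        self.delay_seconds = delay_seconds
        self.queries = 0

    def fetch(self, key: str) -> str:
        self.queries += 1
        time.sleep(self.delay_seconds)  # simulate the stall
        return f"data for {key}"


class CachedPlayerData:
    """Read-through cache: only a cache miss ever touches the database."""

    def __init__(self, database: SlowDatabase) -> None:
        self.database = database
        self.cache = {}  # key -> cached value held in memory

    def get(self, key: str) -> str:
        if key not in self.cache:  # miss: pay the database delay once
            self.cache[key] = self.database.fetch(key)
        return self.cache[key]  # hit: served from memory, no delay


db = SlowDatabase(delay_seconds=0.05)
players = CachedPlayerData(db)
players.cache["inventory"] = "data for inventory"  # already cached while playing

players.get("inventory")  # fast: never reaches the database
players.get("pet")        # slow: the first summon has to query the database
print(db.queries)         # only the pet lookup hit the database
```

In this toy model, everything already in the cache keeps working at full speed, which is why the stall only shows up on actions that have to go to the database.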
If pet lag is present, the server is on a downward spiral towards an imminent full crash - where you will not be able to play at all.
Luckily, pet lag takes a long time to fully manifest into a full crash, so you may notice we do not restart the server until it gets extremely bad (and usually right before the full crash).
Unfortunately, this causes players to have a less than desirable experience - but it's our only choice. If we immediately restarted the server every time the database started to stall, we would be restarting the server every few days. We choose the lesser of two evils.
In the event of a full database server crash, you will be unable to log in (the game will stay on the login screen for a very long time and never finish).
Additionally, all unsaved player data is lost permanently and rolled back to a previous state.
Login server stalling
This is simply caused by extreme resource exhaustion. Mabinogi services do not recover well from failure, and under immense pressure, servers will occasionally lose connectivity with each other.
The moment the login server loses connectivity with the server coordinator or the database server, it stalls and stays stalled until it is manually rebooted.
You can tell this has happened when you attempt to authenticate and you're immediately met with an "Unable to connect to server" error. The login server will not accept new connections when stalled.
We're using Windows 2003 for the game server. Not much more to say here. The software is outdated and occasionally misbehaves, especially under high load.
We use this because the Mabinogi server we use (G13 specifically) was designed for Win 2003 - it isn't very stable in other versions of Windows. Just like the Mabinogi client isn't very compatible with Win8/Win10, the server's got the exact same problems (random freezes, etc)!
Again, we choose the lesser of two evils - unstable server software or an unstable operating system. It just so happens that the operating system as a whole crashes less often than the unstable server software.
This particular problem, though rare, is actually relatively simple to solve (in theory) compared to the other problems: just use a more stable Windows version.
We are able to manually fix each crash we encounter on updated versions of Windows Server, which we have done in beta testing - but we're never fully confident in that work, never really sure that we haven't missed some small case in the hundreds of thousands of lines of assembly code that we have to work with to fix such problems.
A common theme
You may have noticed a common theme by now: resource exhaustion. 4 of our last 5 server crashes were from resource exhaustion (one of them unverified).
You're probably thinking, "Drahan, why exactly does MabiPro suffer from resource exhaustion in the first place?".
It's a good question - especially when you know how the Mabinogi server scales. It does not scale by player count or NPC count...it scales by NPC client count.
More players do not translate linearly into more resource usage.
Our actual resource usage, for the most part, is entirely static. It almost never changes.
By this logic, there is absolutely no reason that we should be suffering from resource exhaustion in the first place.
It all boils down to one thing: the way we host and finance the server.
MabiPro is actually under an extremely low budget, and as such, we cut corners financially...a LOT.
The biggest corner we cut is our dedicated server hosting.
To cut dedicated server costs, we've got connections with a server hosting coalition that shall not be named, and we cut a deal with them.
Not only do we get an extreme discount for dedicated shares of their servers (much cheaper than what we can buy publicly), we are also allowed to use shares of the server that are currently unused (as in, not being used by paying customers), free of charge.
We're using these unused server shares to power roughly 80-90% of our server.
(Those of you reading who are knowledgeable in the server hosting industry - no, we're not technically using a dedicated server, it's just easier to explain it that way.)
What's the catch?
You see, nothing is really free in this world. We save a lot of money from this deal, but it comes at a cost: server stability.
The problem is when a paying customer decides that they want to use the "unused" resources (read as, the resources MabiPro is using) that they deserve.
These resources are ganked away from MabiPro instantly, and allocated to that paying customer until they're done using it.
After they're done using it, we gain those resources again.
(For those knowledgeable in the server hosting industry - yes, this is a form of overselling. It isn't good.)
However, Mabinogi does not respond kindly to its resources being forcefully taken away.
This process will instantly make MabiPro crash for one reason or another.
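A rough way to picture this arrangement is a shared pool in which borrowed capacity can be reclaimed at any moment. The sketch below is only an illustration: the `SharedHost` class, its method names, and every number in it are hypothetical, chosen so that about 85% of the server runs on borrowed shares, in line with the 80-90% figure above.

```python
class SharedHost:
    """Toy model of a host that lends idle capacity to a discounted tenant."""

    def __init__(self, total_units: int, tenant_dedicated: int, tenant_needs: int) -> None:
        self.total_units = total_units
        self.tenant_dedicated = tenant_dedicated  # shares the tenant pays for
        self.tenant_needs = tenant_needs          # what the game server needs to stay up
        self.paying_usage = 0                     # units claimed by paying customers

    def tenant_available(self) -> int:
        """Dedicated shares plus whatever idle capacity is left to borrow."""
        idle = self.total_units - self.tenant_dedicated - self.paying_usage
        return self.tenant_dedicated + max(idle, 0)

    def paying_customer_claims(self, units: int) -> None:
        """A paying customer takes units; borrowed shares shrink immediately."""
        self.paying_usage += units

    def paying_customer_releases(self, units: int) -> None:
        """The paying customer is done; the units become idle again."""
        self.paying_usage = max(self.paying_usage - units, 0)

    def tenant_is_stable(self) -> bool:
        return self.tenant_available() >= self.tenant_needs


# Hypothetical numbers: 15 dedicated units, 100 needed, so 85% is borrowed.
host = SharedHost(total_units=120, tenant_dedicated=15, tenant_needs=100)
print(host.tenant_is_stable())   # True: 15 dedicated + 105 idle

host.paying_customer_claims(40)  # a paying customer wants the "unused" capacity
print(host.tenant_is_stable())   # False: only 15 + 65 = 80 units remain

host.paying_customer_releases(40)
print(host.tenant_is_stable())   # True again once the capacity is handed back
```

The model is kinder than reality: it lets the tenant recover as soon as the capacity returns, whereas the sudden loss described above crashes the server outright.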
Progression of the problem...
The more the host's paying customer base grows, the more often the resources we aren't paying for get forcefully taken away.
In recent weeks, this has grown to a point where it may happen every other day, simply because the server host keeps growing.
In other words: we need to solve this problem, and we need to solve it fast.
Really, the solution to a large majority of our problems is so simple, any of you could figure it out.
We just need to rent a new dedicated server, really.
This is what we plan to do in the near future.
This will increase our operational cost by a significant amount, and for the most part we pay for the server out of pocket.
Fortunately, we have a good sum of Bitcoin donations (thanks to our generous community) that we have kept for a long while - we're going to use these funds to pay for the extra expenses incurred by renting a new server.
If we decide to go through with this plan, we will need continued support from the community in the long run.
We will try our best not to let you guys down, as long as you don't let us down.
Hopefully you enjoyed my large wall of text.
Content Poll #4
Posted on 01-01-18, 02:48 am by Arisa
It's time for Content Poll #4! The questions will be posted below in advance of voting beginning. All questions have Yes or No answers, and you will be free to skip any of them that you're not sure about. Voting will begin in-game on January 3. As usual, for a question to pass, at least 75% of voters must have said Yes.
Please note that votes are always audited and that voting with alts is not allowed.
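The pass rule can be sketched in a few lines. This is an illustration rather than the actual in-game tally: the function name is made up, and it assumes that a skipped question does not count toward the total for that question.

```python
from fractions import Fraction


def question_passes(yes_votes: int, no_votes: int, threshold: Fraction = Fraction(3, 4)) -> bool:
    """Return True if at least 75% of those who answered said Yes.

    Assumption: players who skipped the question are not counted.
    """
    answered = yes_votes + no_votes
    if answered == 0:
        return False  # nobody answered, so nothing can pass
    return Fraction(yes_votes, answered) >= threshold


print(question_passes(75, 25))  # True: exactly 75% Yes
print(question_passes(74, 26))  # False: 74% Yes falls short
```

Using `Fraction` instead of floating-point division keeps the comparison exact at the 75% boundary.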
First off, there are some questions about increasing stack sizes! We propose increasing the stack sizes of ores from mining significantly, because currently it doesn't make much sense to mine when you can use metallurgy to get much more compact ores (and gems!). Also, in the last content poll, it was voted that alchemy crystal stack sizes should be doubled. This poll will decide if they get doubled yet again.
Should the stack sizes of wheat and barley be increased to 10?
Should the stack sizes of all ores (not ore fragments) be increased to 50? This would make them slightly more space efficient than ore fragments.
Should the stack size of mana preservation stones be increased to 25?
Should the stack sizes of all alchemy crystals be doubled for a second time?
Followed by some questions about cosmetic changes to spirit weapons:
Should it be possible to dye spirit weapons?
The glowing effect of an equipped spirit weapon changes color based on its social level. Should the rate at which this color changes be increased?
Historically the Refining skill has been notorious for being difficult to rank because of the failures required. We have the ability to make failures less important to rank the skill:
Should failures be made much less critical for ranking the Refining skill?
Previous content polls only made Composing ranks accessible one at a time. Why don't we just do all the rest and get it over with?
Currently, the Composing skill is capped at rank 5. Should the remaining skill books needed to reach rank 1 become available? They would be found as drops or rewards rather than bought from shops.
It is a minor change, but it may be convenient for characters starting in places other than Tir Chonaill to have access to skill resets without needing to travel to Duncan. We propose allowing Alexina, Castanea and King Krug to perform skill resets as well as Duncan. Additionally, if you have a character at total level 100 then the game considers you to be a seasoned player - but since on MabiPro many starter skills start at rank E, this can be problematic for assistant characters created after reaching total level 100 with your human. We propose checking if the character itself is at least total level 100, rather than if the account has any character that is at least total level 100.
Currently, Duncan is the only NPC that can reset the skills of a player who has not yet reached total level 100. Should Alexina, Castanea and Krug also provide this service?
Should it be possible to reset the skills of a character whose total level is less than 100 even if there is another character on the same account that has reached a total level of 100?
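The difference between the current check and the proposed one can be sketched like this. The data model and function names are hypothetical, not taken from the server code. Only the level 100 cutoff and the account-versus-character distinction come from the text above.

```python
from dataclasses import dataclass, field

SEASONED_TOTAL_LEVEL = 100  # cutoff described in the post


@dataclass
class Character:
    name: str
    total_level: int


@dataclass
class Account:
    characters: list = field(default_factory=list)  # list of Character


def can_reset_current(account: Account, character: Character) -> bool:
    """Current rule: blocked once ANY character on the account reaches level 100."""
    return all(c.total_level < SEASONED_TOTAL_LEVEL for c in account.characters)


def can_reset_proposed(account: Account, character: Character) -> bool:
    """Proposed rule: only the character asking for the reset is checked.

    The account argument is unused; it is kept so both rules share a signature.
    """
    return character.total_level < SEASONED_TOTAL_LEVEL


main = Character("human", 150)
assistant = Character("assistant", 12)
account = Account([main, assistant])

print(can_reset_current(account, assistant))   # False: the human is past level 100
print(can_reset_proposed(account, assistant))  # True: the assistant itself is level 12
```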
It has been suggested that ancient medals from the elf/giant trans quests become possible to trade:
Should it be possible to trade Ancient Medals with other players?
For some reason, new players start with the Blaze skill at rank E. None of us can remember why that is, but Blaze at rank E has a higher skill CP than any other starter skill does at rank 1. We propose making it possible to remove Blaze from your character. You would be able to ask Shyla to remove Blaze for free.
Should it be possible to voluntarily unlearn the Blaze skill if it has not been ranked past rank E? Players can reacquire Blaze by completing the questline for obtaining the skill. If this questline is completed, Blaze cannot be removed. The amount of AP required to rank Blaze from F to E will be refunded if rank E Blaze is removed from a character. Blaze will no longer appear as a starter skill for new players.
This next question is pretty self-explanatory, but we may impose a small minimum donation amount to receive the title. Contributions to Patoots' editor fund will count.
Should players who have donated to the server receive a special title? The title will be purely cosmetic and can be worn in addition to another title.
Next are various buffs to combat skills that were requested by players. The changes to Taunt and Sand Burst would fix a problem where it becomes impossible to train the skill after reaching a certain CP.
Should the distance from which Hailstorm can reach enemies be increased?
Should the time it takes to set up a tower cylinder be reduced from 5 to 3 seconds?
Should the Taunt skill affect enemies that have higher CP?
Should the Sand Burst skill be able to blind enemies that have higher CP?
Some proposals to make items available in shops:
Should trade unlock potions be added to Shyla's shop?
Should it be possible to buy alchemy crystals from shops in locations other than Tailteann or Tara?
And finally, some cosmetic changes:
Should metallurgy spots be given name tags so that they are easier to see during the day?
Should NPC "residents" appear in Dunbarton, like those in Tara, Tailteann, or Cor Village?
Should Nao's face no longer appear on the red moon Eweca?
Should it be possible to dye gold pouches and item bags?
Feel free to discuss these questions in the comments of this post! Voting will begin in-game on January 3, and continue for at least a week.