Methodology matters from simple math 001

Problems by themselves are fish; problem solving is fishing. We will talk about fishing using a simple math problem: which integer, raised to the fifth power, equals 2219006624?

Yes, you could brute force the answer on this one, but it would take you a while, wouldn’t it?

The first approach starts from zoning, meaning you first identify a reasonable zone with an upper and lower boundary. 2219006624 has 10 digits, while 10^5 has 6 digits and 100^5 (that is, 10^10) has 11, so the answer sits between 10 and 100. This is your starting point. Another round of fine tuning (70^5 is about 1.68 billion and 80^5 is about 3.28 billion) lands you in the 70 to 80 range.

With the target ending in an even digit, the base must be even, so you can select from the set of 72, 74, 76, and 78. 76 is an easy out, because any power of a number ending in 6 always ends in 6, not 4. Either sensitivity to numbers or a brute force over the remaining candidates (if the first two are not the answer, the third must be) gets you the right answer, 74.

I, on the other hand, went with the second approach. I quickly sensed a special property of a^5: its last digit is always the same as the last digit of a. This means 2219006624 can only be the fifth power of a number ending in 4. The size of 2219006624 then locks the answer to one of 94, 84, 74, or 64. From there, either an analysis similar to the first approach or a similarly small brute force gives you the right answer, 74.
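
For readers who want to see the arithmetic verified, here is a minimal Python sketch of both approaches; the variable names and ranges are mine, not part of the original problem.

```python
TARGET = 2_219_006_624

# Approach 1: zoning. 10^5 has 6 digits and 100^5 has 11, so the base is between
# 10 and 100; 70^5 and 80^5 bracket the target, so it sits in the 70s.
zone = [x for x in range(70, 81) if x % 2 == 0]  # even bases only, since the target is even

# Approach 2: the last digit of x^5 always equals the last digit of x,
# so the base must end in 4.
candidates = [x for x in zone if x % 10 == 4]

answer = next(x for x in candidates if x ** 5 == TARGET)
print(zone, candidates, answer)  # [70, 72, 74, 76, 78, 80] [74] 74
```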

There may be a third or more approaches; people always have their own ways of getting to the right answer. Again, this post is not about the math itself, but about strategy and thought process. On my last slide, I summarized a few lessons learned:

  1. Lay out a strategy first. No matter what kind of problem you are facing, having an early and well-defined strategy is necessary. The effectiveness of the strategy will decide the outcome.
  2. Break the problem into components and steps. Divide and conquer. The steps can be sequential or concurrent, and the components can be inclusive or independent, but componentization is a common method for dealing with a large-scale challenge.
  3. Execution. A plan does not solve problems by itself; it only guides the “how to”. It takes muscle and will to get the job done. Good execution is always needed.
  4. Diversity. All roads lead to Rome. Different people are sensitive to, or well trained in, different skill areas. They approach the same problem from different angles and can be effective in different ways. Combined, the team can perform at a higher level.

It is common that we need a similar problem-solving approach in the technology world. Zoning in is a useful technique for narrowing down options, root causes, or methods. Pattern recognition (the powerful part of the second approach) is an effective weapon for pinpointing “the thing” at a fine-grained level. Using them both, a problem solver can be a sharpshooter.

With more examples coming, I hope this series serves its purpose of showing how to solve problems, and that the content is useful to you.

One flow, four perspectives of the same technology story

I posted 4 episodes on different technology areas based on a simple data transaction. The story was broken down into various facets: data, stack and brands, cloud native alternatives, and cyber security focus areas.

You can view the 4 blog posts here:

  1. Data polymorphism: Data polymorphism
  2. Infrastructure ecosystem: Infrastructure Ecosystem
  3. Cloud native stack: Infrastructure Ecosystem – Cloud Native View
  4. Cyber security: Cyber security and infrastructure

One Page Press promotes a way of communicating complicated or abstract technology subjects in a human-readable format. It strives for shorter verbiage and more visual presentation. One advantage of creating a base template is the ability to illustrate the services and options from the same problem domain. In this example, we use a generic data flow; it could easily be replaced with a specific business problem. One Page Press is a powerful way to present vast information in a more cohesive way.

Using a sunburst diagram to visualize goals, strategies, tactics, and actions

Tired of reading a lengthy strategy novel by your advisors? Toss it out.

This is what you can use. Everything is centered around the goal.

Strategy: Describes how to achieve the goal. A strategy usually spans multiple years. Consider strategies the foundation pillars that hold up the mission.

Tactics: Describe the artful moves that support a strategy. A tactic has a shorter or limited lifespan inside a strategy. Some tactics have no concrete actions; a tactic might be an attitude, a norm, or an intentional no-action.

Actions: The concrete details of how to carry out a tactic. This is the combination of who, where, when, and how much/how many. Actions are also built into daily operations.

This is a fillable artifact for presenting a complicated combination of strategies, tactics, and actions. It is weighted, so each segment has a supporting number. It is for illustration only.
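
As a concrete starting point, here is a minimal sketch of such a chart using plotly, one possible tool; the goal, strategy, tactic, and action labels and the weights below are placeholders, not a real plan.

```python
# A toy goal / strategy / tactic / action hierarchy rendered as a sunburst.
import plotly.express as px

rows = [
    # (label, parent, weight) -- only the actions carry weights here
    ("Goal", "", 0),
    ("Strategy A", "Goal", 0),
    ("Strategy B", "Goal", 0),
    ("Tactic A1", "Strategy A", 0),
    ("Tactic A2", "Strategy A", 0),
    ("Tactic B1", "Strategy B", 0),
    ("Action A1.1", "Tactic A1", 3),
    ("Action A1.2", "Tactic A1", 2),
    ("Action A2.1", "Tactic A2", 4),
    ("Action B1.1", "Tactic B1", 5),
]
labels, parents, weights = (list(col) for col in zip(*rows))

fig = px.sunburst(names=labels, parents=parents, values=weights)
fig.show()
```

Each outer ring contributes its weight to the ring above it, so the chart also gives a rough view of where the effort concentrates.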

Private cloud connect considerations

What are the main factors for a multi-region cloud computing design?

  • Improve resiliency. Dividing compute units into regions allows for more uptime in case one of the cloud regions experiences outages or degradations.
  • Reduce latency. This allows data processing to be closer to the request, boosting performance.
  • Enable disaster recovery. Each processing site is fully stacked, hot-hot, and failover ready to handle unexpected events.
  • Support a geographically dispersed workforce and remote locations by bringing more compute closer to them.

What do you need to be mindful of for a multi-region cloud design?

  • Gather traffic and usage data. This allows you to decide the size and location of your needed infrastructure.
  • Understand the latency issue. Work with your network providers on private connections and locations.
  • Understand the cost implications of multi-regional business and have a solid return on investment plan.
  • Prepare from the engineering and development sides to have the proper skills, patterns, and practices to handle the complexity.
  • Multi-region does not automatically mean twins or cloning. Understand your system and business, and choose only the applicable portion.

In this example, the cloud computing footprint is based on two regions, one in Northern Virginia and one in Northern California. Key cloud workloads rely on the performance of Direct Connect (DX). With the same network speed, the distance between the DX location and the region of computation comes into play. Please see the example below. The metrics show the latency in milliseconds of a ping operation from a US East client site to the US East 1 cloud region. It varies from 2 milliseconds for an Ashburn, VA, location to 122 milliseconds for a Denver, CO, location, a 61-times difference. Another ping operation from the US East client site to the US East 2 (Ohio) region shows that Ashburn suffered somewhat more latency, while the other three locations all decreased.

Research: Latency Test Result from Various DX locations and Two Different Regions

This data confirms that your DX location contributes to the overall latency of your cloud traffic. It is important to select the cloud region and DX location strategically. It should be based on your geographical application use and specific performance requirements. But one principle holds: you want your primary DX to be as close as possible to your main compute ecosystem.
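
If you want to gather numbers like these for your own sites, a quick probe from each client location is a reasonable start. The Python sketch below is illustrative: it times a TCP handshake to each region endpoint (the endpoints and sample count are assumptions), which is an unprivileged stand-in for the ICMP ping used above rather than the exact methodology behind the chart.

```python
# Rough latency probe: time a TCP handshake to each endpoint on port 443.
import socket
import time

ENDPOINTS = {
    "us-east-1": "ec2.us-east-1.amazonaws.com",
    "us-east-2": "ec2.us-east-2.amazonaws.com",
}

def connect_ms(host: str, port: int = 443, timeout: float = 5.0) -> float:
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; we only care about the handshake time
    return (time.perf_counter() - start) * 1000

for region, host in ENDPOINTS.items():
    samples = [connect_ms(host) for _ in range(5)]
    print(f"{region}: min {min(samples):.1f} ms, avg {sum(samples)/len(samples):.1f} ms")
```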

90-90 rule on technology delivery

The first 90 percent of the code accounts for the first 90 percent of the development time. The remaining 10 percent of the code accounts for the other 90 percent of the development time.

Tom Cargill, Bell Labs

The 90-90 rule is simple, and it is prevalent in the software development and technology delivery world. The essence is that the last 10% of the work takes another 90% of the time. So in reality, when you feel you are 90% done, count yourself at the midpoint. This is partly because the fine craftsman’s final touches are more complicated, but also because of unknowns, unexpected turns, and other compromises.
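
As a toy illustration of that midpoint advice (the numbers are purely made up):

```python
# 90-90 rule as arithmetic: if reaching "90% done" took `elapsed` units of time,
# budget roughly the same amount again for the remaining 10% of the work.
def projected_total(elapsed_at_90_percent: float) -> float:
    remaining = elapsed_at_90_percent  # the last 10% takes another ~90% of the time
    return elapsed_at_90_percent + remaining

print(projected_total(9))  # "90% done" after 9 weeks -> plan for roughly 18 weeks total
```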

This is an interesting topic for project managers, leaders, and most importantly, executives, as they are the ones who promise the final date. First of all, buy into it. Denying the 90-90 rule won’t do any good, because 90% of the time it will happen. Second, keep a big picture in mind of what you can compromise. If you are in the aerospace or construction industry, the likelihood of your customer accepting a 91%-built product is absolutely zero, because no one will trade safety for a hasty rush. However, if your product is social media or casual software, or you are pursuing an emerging market, you might have reasons to let the product go to market first and fix things later. In that scenario, a timely birth beats everything else. Weigh your business properly against the 90-90 rule.

For most others, the challenge is dealt with through risk management and through delivery strategies and tactics. Throughout my career, I have faced mission-critical deliveries or rescues where meeting the timeline was a 100% must, in combination with the following fun factors:

  • Technology unknowns
  • Key members departing
  • Delays in resources (people, equipment, contract signing, etc.)
  • Increased or last-minute compliance scrutiny
  • Integration unavailability

Therefore, without writing a book, here are a few tips on how I have repeatedly emerged from those hazy situations by sticking to the following principles:

First, do not skip planning and re-planning. As much as people believe in action, only effective actions and a smart strategy will do the job. Therefore, the first move is to examine the situation and redo the teaming (most of the time you will need to): build a team that can win, and that wants to win.

If a rescue is needed, never skip re-planning.

Second, put an iron fist on scope. Scope is the universe of your problem. If the scope can be re-carved, kept under control, and shielded from ad hoc executive interference, you have a better chance of succeeding.

Scope is the biggest difference maker; at the very least, freeze it.

Third, overwhelm the challenges with overwhelming resources in the early stage. Don’t sign up if there is no such support. The goal is to secure ahead-of-schedule delivery in the first 90% portion, which is the key. Overstaffing not only brings more expertise, it also boosts team members’ confidence.

Even for a street fight, more hands are always helpful.

Fourth, know the absolute “how”, either yourself or through a trusted team lead. This is where technical skill plays a vital role in reducing overhead, avoiding dependency problems, and timing the integration.

No team can succeed without a technical soul.

Fifth, carry out disciplined, high-performance execution. This is at the tactical and operational level.

With no extra days on the calendar, each day counts.

Last, focus on the rollout strategy. Just as takeoff and landing are the two most important legs of an airplane’s safety and operation, how you complete the work is why the last 10% requires another 90% of the time.

Most complications arrive at completion time.

The 90-90 rule applies regardless of your software development methodology; agile or waterfall does not matter. Work is work. The 90-90 rule is more about understanding and preparing for the last 10%. Similar to an NBA or NFL game, 90% of regulation time (about 6 minutes left) usually leads to another 30 or 45 minutes of actual play, with timeouts, injuries, breaks, and the unexpected.

A strong, prepared, anticipating, and creative mind will prevail.

Cyber security and infrastructure

This is the last episode of the infrastructure series. After themes on the data formats, the technologies, and the cloud native options, I use the same U-shaped transaction flow and mark the related cyber security practices on the map.

The cybersecurity industry uses a model referred to as the security triad, also known as the CIA triad: Confidentiality, Integrity, and Availability.

Cybersecurity practices include organizational compliance, policies, and operations to protect information from being stolen, compromised, or attacked.

As a concept, this picture is self-explanatory. It concentrates core cyber practices in the middle as the strategic group, and places other tactical and operational items in their individual areas according to the technology stack.

Infrastructure Ecosystem – Cloud Native View

This is a continuation of my last post: Infrastructure Ecosystem. People ask if there is a solution that can simplify away the dozens of service providers, the demand for skill diversity, the intricacies of licenses and cost, the integration challenges, and the bottom line of keeping up with maintenance.

One of the solutions is cloud computing. You can shift pretty much the entire ecosystem, or parts of it, to one or more cloud service providers. In the following diagram, I use AWS and its services as an example to show how to map three dozen vendors to a single vendor. This is an extreme form of infrastructure engineering, and it might not cover all the edge cases.

Please note, appliances and marketplace AMIs (Amazon Machine Images) are always available options if you want to bring a vendor-specific solution into your cloud landscape. I repeated this in four places to emphasize that it is at your discretion. In most cases, you don’t want to depend on the cloud service provider for everything native; you might want to leverage their infrastructure hosting but keep the best-fit solutions within your ecosystem.

To keep this picture from becoming too busy to view, I purposely omitted software as a service (SaaS), where your cloud native solution is tightly integrated with “other clouds” such as Salesforce.com, OKTA, Microsoft, Google, etc.

Is this your preference?

Infrastructure Ecosystem

Does this map look similar to the stack at your organization?

This illustration is built on top of my last post: Data Polymorphism. In that blog, the U-shaped computer-to-computer flow describes an end-to-end digital transaction and the various formats the data takes. Here, rather than expanding on data polymorphism, I listed technologies or brands of service, each placed next to its approximate position along the data process. This is not intended to be a complete list.

If you are a chief technologist, or a leader in infrastructure and integration, you are dealing with this crowd. You might be tasked with piecing the puzzle together from technology, acquisitions, service level agreements, workforce skills, and forward-looking replacements. You might be happy with your current providers’ price and performance; or you might be locked in to some degree, having difficulties with licensing negotiations, missing resources, or end-of-life/end-of-support products, or needing a vision for modernization. You should start from a simple illustration like this and explore future options. This is the map of your infrastructure ecosystem.

Please ask yourself some simple questions. Do you want to continue with this harmonic symphony structure, or do you want a smaller crowd with more streamlined tools and services? How much cloud do you want to come into play? Do you prefer managed services over full infrastructure control? What are your budget and timeline? Those are the questions that matter to the decision. In the next post, I will share the same ecosystem with a cloud native approach, using AWS as an example. You can achieve the same goal with other cloud providers as well, or in a hybrid, multi-cloud architecture.

Data polymorphism

Data polymorphism illustrated via a simple transaction

In this diagram, I illustrated the journey of a simple customer information transaction from end to end. It includes key compute and networking components, as well as the various data formats the same information takes along the way (a small sketch follows the list):

  • HTML
  • Bytes
  • JSON
  • Java
  • XML
  • Relational database
  • Queue or Message
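
To make the polymorphism concrete, here is a small Python sketch showing one hypothetical customer record in a few of these formats; the field names are invented for illustration.

```python
# The same (made-up) customer record in several of the formats listed above.
import json
import xml.etree.ElementTree as ET

customer = {"id": 42, "name": "Jane Doe", "email": "jane@example.com"}  # illustrative fields

as_json = json.dumps(customer)                 # what a browser or API layer might exchange
as_bytes = as_json.encode("utf-8")             # the raw bytes on the wire

root = ET.Element("customer")                  # what an XML-based integration might expect
for key, value in customer.items():
    ET.SubElement(root, key).text = str(value)
as_xml = ET.tostring(root, encoding="unicode")

print(as_json)
print(as_bytes[:20])
print(as_xml)
```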

Technology Incident pyramid

The accident triangle, also known as Heinrich’s triangle or Bird’s triangle, is a theory of industrial accident prevention. It shows the relationship between serious accidents, minor accidents, and near misses. You can also read about the safety pyramid in this blog: https://www.oshaoutreachcourses.com/blog/safety-triangle-the-safe-pyramid/

Inspired by Heinrich’s and Bird’s triangles, I created a similar pyramid to illustrate the same concept applied to the information technology and cyber security field. It shares the same observation: one impactful event usually has signs, precursors, and minor episodes ahead of the day of the avalanche. Many times, sadly enough, the problem is known to insiders but is not properly addressed by the principals.

There are a few common catalysts or accelerators:

  • Human mistake: can be the trigger, but can also be the failure of the defense. It is often compounded by lack of resources, lack of training, fatigue, and low morale.
  • Political agenda: sets up the background for the main event. The extra push, the unwise delay, and the convenient ignoring of warnings can all be contributors.
  • Unusual pattern: like bad weather for a flight, a sudden surge in usage can expose problems not seen in the known state.

Here are some suggestions to prevent the disaster pyramid from materializing:

  • Identify and protect high-value assets. Not every system causes the same damage. Understand what is important to you.
  • Strive for quality. Don’t let buzzwords replace the simple word “quality”. Limit how much timeline or budget can compromise quality. Have the courage to choose “no go”.
  • Nurture a culture of accountability. Where everyone is a piece of the puzzle, we need every piece to complete the puzzle.
  • Have a strong incident management program. Don’t let the 30 bubble up to the 1. Do good root cause analysis, and act on it. Patch as soon as possible and fix as completely as possible. Incur minimum technical debt.
  • Focus on DevSecOps. Link the three elements together. If Sec and Ops see a Dev problem, report it, verify it, and fix it.
  • Imagine the worst-case scenario, then plan, drill, and have a solid continuity process and staff ready.
  • Lastly, training and communication. They can reduce the 1 at the top of the pyramid to just a fraction of it.

I hope some parts of this post are useful to you.