
(access request) MacFan for ops
Closed, Invalid (Public)

Description

I have decided that I would like to request ops. I feel that I could be more useful as ops: I could help with things such as MediaWiki upgrades, other software updates, and other ops tasks. There have also been tasks that I was interested in but couldn't do anything about since I wasn't ops. I have been an mw-admin for almost a year now, and I am also a puppet-user. Below is a list of what I do and don't know.


Mail: I currently run a mail server, and I have been using our webmail app, Roundcube, since before Miraheze installed it.

icinga: I am currently running an instance of icingabot, icinga2, and icingaweb2, and I have gotten fairly familiar with their functions.

Phabricator: I have a Phab instance, which I have upgraded before, and I am familiar with Phab's command line interface and the web admin UI.

Matomo: I have a Matomo instance. I am familiar with the "superuser interface" and with how to update it.

mariadb: I have been using sql.php for things like SELECT, UPDATE, SET, DELETE. With more access I could do things like DROP.

mediawiki: I am an mw-admin and I plan to keep doing mw work.

ssl/dns: I am a puppet-user and plan to keep doing ssl work. I have had some commits to dns.

debian: I am familiar with the apt commands and things like dist-upgrade. My server runs Debian.

Further: I have used RamNode before and am familiar with their web services.

Not as much knowledge: I don't know much about varnish, nginx, and puppet, but I've read the docs and have had plenty of commits to the puppet repo.

repo        commits
mw-config   346
puppet      52
ssl         48
dns         16

Event Timeline

Reception123 triaged this task as Normal priority. Feb 27 2019, 18:25

I would not have an issue with Ops, but I feel like we possibly need a restructuring of groups and a removal/replacement of mw-admins.

Since ManageWiki arrived, a dedicated staff group for mw-admins is unjustified in my opinion, as their only job would really be to look at logs and run maintenance scripts, along with other minor tasks.

However, Ops should have a certain role of management and decision making, and the fact that all staff would become Ops is something unusual. That's why I think that we should do a restructuring of roles, and perhaps, even if someone had access to all servers, the title of "ops" would not necessarily be given to every user. There should probably be a separate discussion about this, but with this access request coming I think now would be the time for that.

Hi, what stuff did you want to work on but couldn't?

Secondly, you have a small number of commits in the other repos that need ops access.

You also need knowledge of puppet, seeing as our entire infrastructure is built around puppet and not just the software that we use.

Also, what software do you use for your mail?

We have our icinga set up to use puppet (so you cannot manually edit a config).

You also don't mention which tasks you could do right now with ops status.

Hi, what stuff did you want to work on but couldn't?

Things like renewing wildcard certs and tasks involving dropping DBs.

Secondly, you have a small number of commits in the other repos that need ops access.

Well, I don’t have much need to commit to things like puppet if I’m not ops.

You also need knowledge of puppet, seeing as our entire infrastructure is built around puppet and not just the software that we use.

I know enough to be able to depool cache proxies and change the configuration of different services.

Also, what software do you use for your mail?

Dovecot and Postfix.

We have our icinga set up to use puppet (so you cannot manually edit a config).

Note that I’ve done PRs to edit the icinga configuration before (such as notifications).

You also don't mention which tasks you could do right now with ops status.

There isn’t anything with the ops tag that I know I can do right now, but there sometimes is.

For "I know enough to be able to depool cache proxies and change the configuration of different services" we have the varnish-admin group for that :).

For "I know enough to be able to depool cache proxies and change the configuration of different services" we have the varnish-admin group for that :).

That is part of what I am saying, we need to reorganize these groups and clarify the role of Operations.

For "I know enough to be able to depool cache proxies and change the configuration of different services" we have the varnish-admin group for that :).

That is part of what I am saying, we need to reorganize these groups and clarify the role of Operations.

That group also hasn’t been used in a few years.

In order to assess ability and capabilities in a real situation, please answer the following questions to the best of your abilities.

  1. How would you determine who is logged in on a server?
  2. How would you investigate network connectivity problems between two servers?
  3. What port do commands like ping and traceroute use?
  4. If you had the opportunity to implement software which had to transfer data across servers, which transport layer would you use and why?
  5. What would you do if Icinga reported the MySQL process on db4 as CRITICAL?
  6. https://meta.miraheze.org/ starts showing a 503, how would you handle it?
  7. Why do we have 2 certain servers in the US? What are the servers I am likely referring to in this question?
  8. What email authentication protocols/methods does Miraheze deploy?
  9. You notice suspicious activity on puppet1. After investigation you find it is not an operations member. No one else is available and you are not able to figure out how access was gained. What do you do?
  10. Users from Asia are reporting they cannot access Miraheze. What would you do?
  11. A DNS request occasionally returns an out of date record. How would you debug which server was returning this?
  12. How do you add a new mail account for someone?
  13. Images begin to stop showing, what would you do?
  14. A large number of users begin to complain they are unable to login. What would you do?
In T4148#79522, @John wrote:

In order to assess ability and capabilities in a real situation, please answer the following questions to the best of your abilities.

  1. How would you determine who is logged in on a server?

Commands: w or who

  2. How would you investigate network connectivity problems between two servers?

I would ping the servers from each other and, once I find the problem, open a ticket with RamNode.

  3. What port do commands like ping and traceroute use?

AFAIK ping doesn’t use a port, and for traceroute the port is specified with -p.
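As a rough illustration of why ping has no port: ICMP sits directly on top of IP, and an echo request header carries only a type, code, checksum, identifier, and sequence number. The sketch below builds such a packet in memory (it does not send anything); the function names are made up for this example. For comparison, the common Linux traceroute defaults to UDP probes with destination ports starting at 33434, which is the value that -p changes.

```python
import struct

ICMP_ECHO_REQUEST = 8  # ICMP type for "echo request" (RFC 792)


def internet_checksum(data: bytes) -> int:
    """One's-complement checksum over 16-bit words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad to an even number of bytes
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo request. Note there is no port field anywhere."""
    # First pass with a zero checksum, then fill in the real one.
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, identifier, sequence)
    return header + payload
```

The identifier and sequence fields do the job that ports would otherwise do: they let the sender match replies to requests.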

  4. If you had the opportunity to implement software which had to transfer data across servers, which transport layer would you use and why?

I don’t exactly know.

  5. What would you do if Icinga reported the MySQL process on db4 as CRITICAL?

I would check the logs and attempt to restart mariadb. If it fails, I would make sure that it goes into recovery mode.

  6. https://meta.miraheze.org/ starts showing a 503, how would you handle it?

I would check the logs and icinga, and debug from there.

  7. Why do we have 2 certain servers in the US? What are the servers I am likely referring to in this question?

cp2 and bacula1. cp2 handles US traffic.

  8. What email authentication protocols/methods does Miraheze deploy?

SASL

  9. You notice suspicious activity on puppet1. After investigation you find it is not an operations member. No one else is available and you are not able to figure out how access was gained. What do you do?

I would try to strengthen security measures where possible.

  10. Users from Asia are reporting they cannot access Miraheze. What would you do?

I would investigate and debug cp5.

  11. A DNS request occasionally returns an out of date record. How would you debug which server was returning this?

I would check puppet on misc1 and ns1 and, if needed, debug gdnsd.
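One general way to narrow down which server returns a stale record is to query each authoritative nameserver directly and compare the answers. The following is only a sketch of that idea, not a description of how Miraheze does it: the helper names and the server hostnames are hypothetical, and it assumes the `dig` command is installed.

```python
from __future__ import annotations

import subprocess
from collections import Counter


def query_nameserver(server: str, name: str, rtype: str = "A") -> frozenset[str]:
    """Ask one nameserver directly via dig and return its set of answers."""
    result = subprocess.run(
        ["dig", "+short", f"@{server}", name, rtype],
        capture_output=True, text=True, timeout=10, check=True,
    )
    return frozenset(line.strip() for line in result.stdout.splitlines() if line.strip())


def find_outliers(answers: dict[str, frozenset[str]]) -> list[str]:
    """Return the servers whose answer differs from the most common answer."""
    if not answers:
        return []
    majority, _ = Counter(answers.values()).most_common(1)[0]
    return sorted(server for server, answer in answers.items() if answer != majority)


# Hypothetical usage (server names are placeholders):
# servers = ["ns1.example.org", "ns2.example.org"]
# answers = {s: query_nameserver(s, "meta.example.org") for s in servers}
# print(find_outliers(answers))
```

The majority answer is not necessarily the correct one, so a flagged server is only a starting point for checking zone data and logs on that host.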

  12. How do you add a new mail account for someone?

Create a shell account on misc1 and set the password (useradd, adduser, passwd).

  13. Images begin to stop showing, what would you do?

Debug LizardFS and check logs.

  14. A large number of users begin to complain they are unable to login. What would you do?

Debug and check logs for redis.

“Debug” isn’t really a satisfactory answer. I need details: where will you look, why, and what steps will you take?

  7. The servers I was referring to are ns1 and bacula1. There is a reason these are not in the Netherlands - why?
  8. That is not an email authentication protocol/method.
  9. How can you strengthen something you don’t know is weak? Would you take no other actions at all?
  10. I’m looking for a reply which requires no server access at all.
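For context on the email authentication follow-up: SASL authenticates a client to a mail server, whereas the methods usually meant by "email authentication" (SPF, DKIM, DMARC) are published as DNS TXT records that receiving servers check. The thread does not say which of these Miraheze deploys, so the sketch below is only a generic illustration; `classify_txt_record` is a hypothetical helper that labels a TXT record value by its version tag.

```python
def classify_txt_record(value: str) -> str:
    """Label a DNS TXT record value as SPF, DKIM, DMARC, or other."""
    text = value.strip().strip('"').lower()
    # SPF records live on the domain itself and start with "v=spf1".
    if text == "v=spf1" or text.startswith("v=spf1 "):
        return "SPF"
    # DMARC policies are published at _dmarc.<domain>.
    if text.startswith("v=dmarc1"):
        return "DMARC"
    # DKIM keys are published at <selector>._domainkey.<domain>;
    # the v=DKIM1 tag is recommended but optional, so this can miss some keys.
    if text.startswith("v=dkim1"):
        return "DKIM"
    return "other"
```

Looking up these records for a domain needs only a DNS client, not access to the mail server itself.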

I have no direct concerns myself, although you mentioned there have been instances where expanded access would have allowed you to resolve issues yourself, whereas there are currently no such instances.

What are some specific tasks that are already closed that you would have liked to handle yourself if you had expanded access?

Also, I'm concerned by some of the answers to John's questions, but I'll wait to see your reply to his followups.