Skip to content
This repository has been archived by the owner on Nov 23, 2018. It is now read-only.

Database design #94

Open
whymarrh opened this issue Oct 13, 2013 · 24 comments
Open

Database design #94

whymarrh opened this issue Oct 13, 2013 · 24 comments
Assignees

Comments

@whymarrh
Copy link
Contributor

@MitMaro did up a design for the database:

abadatabaseeer

Workbench File: http://www.cs.mun.ca/~oram/ABADatabaseEER.mwb

@whymarrh
Copy link
Contributor Author

This will impact issues #29, #48, #68, and #69 at least.

@MitMaro
Copy link
Member

MitMaro commented Oct 13, 2013

This fits with #29 perfectly, or at least how I plan to implement the algorithm.

I think #48 is just a naming issue. What I called Calendar is currently called Schedule. I actually agree to the word schedule over calendar. The only other change is the removal of available, which might fix performance issues from #48.

#68 is taken care of in this design and is the reason for the user_name table.

#69 is taken care of as terms of service is a integer. It's hard to see that detail in the diagram but all the information is included in the MySql Workbench file I created.

@ghost ghost assigned MitMaro Oct 13, 2013
@whymarrh
Copy link
Contributor Author

I don't mean "impact" in a negative sense, just as a link to the relevant issues.

@MitMaro
Copy link
Member

MitMaro commented Oct 13, 2013

It also references #63, #66, #45 and #15. The database hits everything which is why it needs to be built well.

@whymarrh
Copy link
Contributor Author

In the event that the salt column is related to the user's password, I would note that if we use Bcrypt (which is what's in the codebase now), the salt is stored as part of the password hash.

@MitMaro
Copy link
Member

MitMaro commented Oct 13, 2013

I wasn't sure if PHP had Bcrypt. If that's the case I will cut that from the user.

@whymarrh
Copy link
Contributor Author

What is the desired difference between name and username? Are we storing the actual/real names of users? Also, is it a good idea/possible to combine certificate_of_conduct and certificate_of_conduct_date into one column, and have NULL represent a user not having their Certificate of Conduct -- if a user has one, there will be a date on it.

@MitMaro
Copy link
Member

MitMaro commented Oct 14, 2013

user_name is the users display name and not username. We currently don't have a username.

The issue with one column for certificate of conduct is how do people insert a non-dated certificate which is something we talked about doing. Though we can require the date.

@whymarrh
Copy link
Contributor Author

I think it would be best to require a date for each Certificate of Conduct, as I imagine that they have a date on the page itself -- in the event that is common to not know the date of receipt, this no longer works, but I do think this is worth looking into. It would also be in the best interest of the parent when looking for therapists to not have results show up without a date.

Having a date on all certs. would also allow us (in future iterations) to notify a user with a cert. that is x years old.

For the usernames/display names: if I understand correctly, we are currently using email addresses as the first half of the user's login credentials -- what usernames would be. Do we need to switch to usernames? I don't see it as having much benefit, as it is another field in the registration form and another thing for the user to remember.

Thoughts?

@binny45
Copy link
Contributor

binny45 commented Oct 14, 2013

The only thing with requiring a date on the certificate is that it would exclude newer potential ABA therapists who haven't received them yet.

With regard to usernames, as long as the email address doesn't identify the user (like mine [email protected]), I'd be just as cool with email as a user name as the rest. But how risky could this be?

@MitMaro
Copy link
Member

MitMaro commented Oct 14, 2013

There was a username field in user that I didn't mean to add.

I have renamed user_name to user_display_name

I have renamed user_type to account_type and added a many-to-many relationship between user and account_type called user_account_type.

Uploading the new image and file now.

@MitMaro
Copy link
Member

MitMaro commented Oct 14, 2013

It would be possible to not have a certificate of conduct. What we are not sure about is when a person says yes to having one do we require a date?

@binny45
Copy link
Contributor

binny45 commented Oct 14, 2013

If yes, then require date.

@whymarrh
Copy link
Contributor Author

@binny45

The email address is the best best to properly validate users when they register, so it's essentially a required field for us to have. The choice comes down to whether we want to have another parallel field, a username, that the user would use when logging in. But because nobody other than the user should ever see the email address, it's a fairly safe choice to have just that.

About the Certificates, a value of NULL would be reserved for those who have yet to receive it.

@binny45
Copy link
Contributor

binny45 commented Oct 14, 2013

Sounds good to me :)

@whymarrh
Copy link
Contributor Author

How would we feel about moving aba_course and certificate_of_conduct into a new table (maybe called user_data or something similar) as they are the only two columns that are super-specific to this application. This way the schema can be made generic by just removing that table.

@binny45
Copy link
Contributor

binny45 commented Oct 14, 2013

Sounds good. Would keep it out of the way until needed.

@whymarrh
Copy link
Contributor Author

Just as a follow up to our discussion on Tuesday, here are the changes I can remember. Names with a strikethrough are columns we decided to (re)move, all the other names are columns which we decided to add:

user_display_name
active
user
aba_course
conduct_certificate
conduct_certificate_date
phone_number
user_gender_id
user_data
gender*
phone_number
certificate_of_conduct
aba_course
  • I think we decided that gender will be either a VARCHAR, CHAR or something similar. We don't need the extra lookup table for this. It could be a single character: M, F, or NULL.
user_gender
user_gender_id
name
user_schedule
enabled
name
location
city_town*
  • I'm not 100% sure that city_town is a useful piece of data. As we discussed, it's for edge cases where one postal code is in multiple cities. But in such a case, will knowing the city be beneficial? I'm not sure if Google Maps will consider the city when given just that and the postal code.

As an aside, I think we should rename the tables to plural versions of the items, e.g. users table, schedules table. I (opinion) think this is a lot clearer when writing a query: select * from users. I also think that we should call the primary key columns in each table other than the join tables id (or uuid should we decide to choose that option).

Finally, I think that location_distance should be a join or lookup table more so than a secondary table (?) and that the join tables should have foreign keys in place of what they currently have as primary keys.

@whymarrh
Copy link
Contributor Author

Related to #52 as well.

@whymarrh
Copy link
Contributor Author

How does this look: see Gist.

@whymarrh
Copy link
Contributor Author

whymarrh commented Nov 5, 2013

Which IDs are UUIDs and which are integers?

@whymarrh
Copy link
Contributor Author

whymarrh commented Nov 5, 2013

Does the schedule table require a relationship with the schedule_interval table? Maybe a many-to-many relationship?

i.e. How would one get all the intervals given only a schedule? The Schedule entity would need to have an array of some sort. With an array of schedules, you could just traverse the array and do whatever action with each of the intervals.

Edit: I think the current relationship is backwards.

@MitMaro
Copy link
Member

MitMaro commented Nov 8, 2013

The only UUIDs are the account id and the user id.

The current relationship between schedule and schedule_interval is a one-to-many. Every schedule has many schedule intervals. I don't understand the problem.

To get the intervals for a schedule you would do a query along the lines of:

SELECT * FROM schedule_interval
JOIN schedule ON schedule.id = schedule_interval.schedule_id
WHERE schedule.id = ?

It would be stored as an array in the Schedule entity.

I think you are thinking that schedule_interval is unique on start_time, end_time and week_day but it's not. It's unique on start_time, end_time, week_day and schedule_id.

It would be possible to make it a many-to-many relationship but for very little gain. I don't think it's really worth it as many-to-many relationships can be a pain to deal with.

@whymarrh
Copy link
Contributor Author

whymarrh commented Nov 8, 2013

Fair points all around. My confusion came from where the array of intervals was to be stored, and as you clarified, it would be stored in the Schedule entity.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants